Reporter Gene System For Use In Cell-Based Assessment
Of Inhibitors Of The Hepatitis C Virus Protease 

Technical and Industrial Applicability of Invention 

The invention relates to a cell-based assay system in which the detection of reporter
gene activity (secreted alkaline phosphatase, or SEAP) is dependent upon active
Hepatitis C virus (HCV) NS3 protease. The assay system is useful for the in vitro
screening, in a mammalian cell-based assay, of potential protease-inhibiting molecules
useful in the treatment of HCV. The advantage of using SEAP over more routinely used
reporter genes, such as beta-galactosidase or luciferase, is that a cell lysis step is
not required because the SEAP protein is secreted out of the cell. The absence of a
cell lysis step decreases intra- and inter-assay variability and makes the assay easier
to perform than earlier assays.

Background of The Invention 

HCV is one of the major causes of parenterally transmitted non-A, non-B hepatitis
worldwide, and is now known as the etiologic agent for non-A, non-B hepatitis
throughout the world. Mishiro et al., U.S. Patent No. 5,077,193; Mishiro et al., U.S.
Patent No. 5,176,994; Takahashi et al., U.S. Patent No. 5,032,511; Houghton et al.,
U.S. Patent Nos. 5,714,596 and 5,712,088; as well as M. Houghton, Hepatitis C Viruses,
p. 1035-1058 in B. N. Fields et al. (eds.), Fields Virology (3d ed. 1996). HCV
infection is characterized by the high rate (>70%) with which acute infection
progresses to chronic infection (Alter, M. J. 1995. Epidemiology of hepatitis C in the
west. Sem. Liver Dis. 15:5-14). Chronic HCV infection may lead to progressive liver
injury, cirrhosis, and in some cases, hepatocellular carcinoma. Currently, there are
no specific antiviral agents available for the treatment of HCV infection. Although
alpha interferon therapy is often used in the treatment of HCV-induced moderate or
severe liver disease, only a minority of patients exhibit a sustained response.
Saracco, G. et al., J. Gastroenterol. Hepatol. 10:668-673 1995. Additionally, a
vaccine to prevent HCV infection is not yet available, and it remains uncertain
whether vaccine development will be complicated by the existence of multiple HCV
genotypes as well as viral variation within infected individuals. Martell, M. et al.,
J. Virol. 66:3225-3229 1992; Weiner, et al., Proc. Natl. Acad. Sci. 89:3468-3472 1992.
The presence of viral heterogeneity may increase the likelihood that drug-resistant
virus will emerge in infected individuals unless antiviral therapy effectively
suppresses virus replication. Most recently, several of the HCV-encoded enzymes,
specifically the NS3 protease and NS5B RNA polymerase, have been the focus of
intensive research, in vitro screening, and/or rational drug design efforts.

HCV has been classified in the flavivirus family (Flaviviridae) in a genus separate
from that of the flaviviruses and the pestiviruses. Rice, C. M., in B. N. Fields and
P. M. Knipe (eds.), Virology, 3rd edn., p. 931-959; 1996 Lippincott-Raven,
Philadelphia, PA. Although the study of HCV replication is limited by the lack of an
efficient cell-based replication system, an understanding of replicative events has
been inferred from analogies made to the flaviviruses, pestiviruses, and other
positive-strand RNA viruses. HCV has a 9.4 kb single positive-strand RNA genome
encoding over 3,000 amino acids. The genome expresses over 10 structural and
non-structural proteins. Post-translational processing of the viral polyprotein
requires cleavage by two proteases. As in the pestiviruses, translation of the large
open reading frame occurs by a cap-independent mechanism and results in the production
of a polyprotein of 3010-3030 amino acids. Proteolytic processing of the structural
proteins (the nucleocapsid protein or core (C), and two envelope glycoproteins, E1 and
E2) is accomplished by the action of host cell signal peptidases. Santolini, E., et
al., J. Virol. 68:3631-3641 1994; Ralston, R., et al., J. Virol. 67:6753-6761 1993.
Cleavage of the nonstructural proteins (NS4A, NS4B, NS5A, and NS5B) is mediated by the
action of the NS2/3 protease or the NS3 protease. Grakoui, A. et al., J. Virol.
67:2832-2843 1993; Hirowatari, Y., et al., Anal. Biochem. 225:113-120 1995;
Bartenschlager, R. et al., J. Virol. 68:5045-5055 1994; Eckart, M. R., et al.,
Biochem. Biophys. Res. Comm. 192:399-406 1993; Tomei, L., et al., J. Virol.
67:4017-4026 1993. NS4A is a cofactor for NS3, and NS5B is an RNA-dependent RNA
polymerase. Bartenschlager, R. et al. (1994); Failla, C., et al., J. Virol.
68:3753-3760 1994; Lin, C. et al., Proc. Natl. Acad. Sci. 92:7622-7626 1995; Behrens,
S.-E., et al., EMBO J. 15:12-22 1996. Functions for the NS4B and NS5A proteins have
yet to be defined.

NS2/3 is a metalloprotease and has been shown to mediate cleavage at the 2/3 junction
site. Grakoui et al. (1993); Hijikata, M., et al., J. Virol. 67:4665-4675 1993. In
contrast, the NS3 protease is required for multiple cleavages within the nonstructural
segment of the polyprotein, specifically at the 3/4A, 4A/4B, 4B/5A, and 5A/5B junction
sites. Bartenschlager et al. (1993); Eckart, M. R., et al., Biochem. Biophys. Res.
Comm. 192:399-406 1993; Grakoui et al. (1993); Tomei et al. (1994). More recently, it
has been suggested that the NS2/3 protease might actually be part of the HCV NS3
protease complex, even though the two have functionally distinct activities. Although
NS3 protease is presumed to be essential for HCV viability, definitive proof of its
necessity has been hampered by the lack of an infectious molecular clone that can be
used in cell-based experiments. However, two independent HCV infectious molecular
clones have recently been developed and shown to replicate in chimpanzees.
Kolykhalov, A. A., et al., Science 277:570-574 1997; Yanagi, M., et al., Proc. Natl.
Acad. Sci. 94:8738-8743 1997. The requirement for NS3 in the HCV life cycle may be
validated in these clones by using oligonucleotide-mediated site-directed mutagenesis
to inactivate the NS3 catalytic serine residue and then determining whether infectious
virus is produced in chimpanzees. Until these experiments are performed, the necessity
of NS3 is inferred from cell-based experiments using the related yellow fever virus
(YFV) and bovine viral diarrhea virus (BVDV). Mutagenesis of the YFV and BVDV NS3
protease homologs has shown that NS3 serine protease activity is essential for YFV and
BVDV replication. Chambers, T. J., et al., Proc. Natl. Acad. Sci. 87:8898-8902 1990;
Xu, J., et al., J. Virol. 71:5312-5322 1997.

In general, when investigators screen potential anti-viral compounds for inhibitory
activity, the process usually involves initial in vitro testing of putative enzyme
inhibitors followed by testing of the compounds in infected cell lines and animals.
Working with live virus in large-scale screening activities can be inherently
dangerous and problematic. While final testing of putative inhibitors in infected
cells and animals is still necessary for preclinical drug development, such work is
cost-prohibitive and unnecessary for the initial screening of candidate molecules.
Furthermore, the inability to grow HCV in tissue culture in a reproducible,
quantitative manner prevents the evaluation of potential antiviral agents for HCV in a
standard antiviral cytopathic effect assay. In response to this real need in the
industry, development of non-infectious, cell-based screening systems is essential.

For example, Hirowatari et al. developed a reporter assay system, inter alia, that
involves the transfection of mammalian cells with two eukaryotic expression plasmids.
Hirowatari, et al., Anal. Biochem. 225:113-120 1995. One plasmid has been constructed
to express a polyprotein that encompasses the HCV NS2-NS3 domains fused in frame to an
NS3 cleavage site followed by the HTLV-1 TAX1 protein. A second plasmid has been
constructed to place expression of the chloramphenicol acetyltransferase (CAT)
reporter gene under the control of the HTLV-1 LTR. Thus, when COS cells are
transfected with both plasmids, NS3-mediated cleavage of the TAX1 protein from the
NS2-NS3-TAX1 polyprotein allows the translocation of TAX1 to the nucleus and
subsequent activation of CAT transcription from the HTLV-1 LTR. CAT activity can be
measured by assaying the acetylation of 14C-chloramphenicol through chromatographic or
immunological methods. In the CAT assay generally, cell extracts are incubated in a
reaction mix containing 14C- or 3H-labeled chloramphenicol and n-butyryl coenzyme A.
The CAT enzyme transfers the n-butyryl moiety of the cofactor to chloramphenicol. For
a radiometric liquid scintillation counting (LSC) assay, the reaction products are
extracted with a small volume of xylene. The n-butyryl chloramphenicol partitions
mainly into the xylene phase, while unmodified chloramphenicol remains predominantly
in the aqueous phase. The xylene phase is mixed with a liquid scintillant and counted
in a scintillation counter. The assay can be completed in as little as 2-3 hours, is
linear for nearly three orders of magnitude, and can detect as little as 3 x 10^-4
units of CAT activity. CAT activity also can be analyzed using thin layer
chromatography (TLC). This method is more time-consuming than the LSC assay, but
allows visual confirmation of the data.

Similarly, the patents of Houghton et al. (U.S. Patent No. 5,371,017, U.S. Patent No.
5,585,258, U.S. Patent No. 5,679,342 and U.S. Patent No. 5,597,691) and Jang et al.,
WO 98/00548, all disclose a cloned NS3 protease, or a portion thereof, fused to a
second gene encoding a protein for which a surrogate expression product can be
detected (for example, β-galactosidase, superoxide dismutase, or ubiquitin in the '017
patent of Houghton; in Jang, expression is measured by the proliferation of poliovirus
in cell culture), and its use for candidate screening. It is unclear in the Houghton
et al. patents, however, whether the protease described in the specification is the
NS2/3 metalloprotease or the NS3 serine protease. Although the serine protease is
claimed, the experimental data show putative cleavage of the N-terminal SOD fusion
partner at the NS2/3 junction, a function which recently has been deemed to be the
domain of the NS2/3 metalloprotease (Rice, C. M., et al., Proc. Nat. Acad. Sci.
90:10583-10587 (1993)). Furthermore, the Houghton et al. patents do not disclose an
active soluble NS3 serine protease, but rather an insoluble protein derived from E.
coli inclusion bodies, which was N-terminally sequenced. For purposes of the present
invention, the term "NS2 protease" will refer to the enzymatic activity associated
with the NS2/3 metalloprotease as defined by Rice et al., and the term "NS3 protease"
will refer to the serine protease located within the NS3 region of the HCV genome.

De Francesco et al., U.S. Patent No. 5,739,002, also describes a cell-free in vitro
system for testing candidates which activate or inhibit NS3 protease by measuring the
amount of cleaved substrate. Hirowatari et al. (1995) discloses another HCV NS3
protease assay; however, it differs from the present invention in several aspects,
including the reporter gene, the expression plasmid constructs, and the method of
detection. Recently, Cho et al. described a similar SEAP reporter system for assaying
HCV NS3 protease, which also differs in its structure and function from the present
invention. Cho et al., J. Virol. Meth. 72:109-115 1998. Also of interest is an NS3
protease assay system developed by Chen et al. in WO 98/37180. The Chen et al.
application describes a fusion protein in which an NS3 protease polypeptide, or
various truncation analogs thereof, is fused to the NS4A polypeptide, or various
truncation analogs thereof, and which is not autocleavable. The fusion protein is then
incubated with known substrates, with or without inhibitors, to screen for inhibitory
effect.

There are a number of problems inherent in all the abovementioned assay systems. For
example, the reporter gene product or analyte is many steps removed from the initial
NS3 protease cleavage step; the cells used in the assay system are prokaryotic or
yeast based and must be lysed before the reporter gene product can be measured; or the
surrogate marker is proliferation of live virus. All of these problems are overcome in
the present invention as summarized below.

Summary of Invention

The present invention describes a reporter gene system for use in the cell-based
assessment of inhibitors of the HCV protease. Applicants point out that, throughout
the description of this invention, references to specific non-structural (NS) regions
or domains of the HCV genome are functional definitions and correspond approximately
to the defined sequence locations described by C. M. Rice and others. The present
invention discloses the co-transfection of a target cell line with a viral vector
which has been engineered to express the T7 RNA polymerase and a recombinant plasmid
or viral vector which has been engineered to express, under control of the T7
promoter, a polyprotein that includes the NS3 HCV serine protease and the secreted
human placental alkaline phosphatase (SEAP) gene (Berger et al. 1988). The present
invention was designed to link the detection of reporter gene activity to NS3 serine
protease activity through construction of a segment of the HCV gene encoding the
NS2-NS3-NS4A-NS4B' sequence linked to the SEAP reporter.

Detection of NS3 protease activity is accomplished by making the release, and hence
the subsequent detection, of the SEAP reporter dependent upon NS3 serine protease
activity. In a preferred embodiment, the target cell line is first infected with a
viral vector that expresses the T7 RNA polymerase, followed by either co-infection
with a second viral vector that encodes the NS3 HCV protease/SEAP polyprotein, or
transfection with a plasmid that contains the same NS3/SEAP gene elements.

The SEAP enzyme is a truncated form of human placental alkaline phosphatase, in which
the cleavage of the transmembrane domain of the protein allows it to be secreted from
the cells into the surrounding media. SEAP activity can be detected by a variety of
methods including, but not limited to, measurement of catalysis of a fluorescent
substrate, immunoprecipitation, HPLC, and radiometric detection. The luminescent
method is preferred due to its increased sensitivity over colorimetric detection
methods, and such an assay kit is available from Tropix®. The advantage of using SEAP
over more routinely used reporter genes, such as beta-galactosidase or luciferase, is
that a cell lysis step is not required because the SEAP protein is secreted out of the
cell. The absence of a cell lysis step decreases intra- and inter-assay variability
and makes the assay easier to perform than earlier assays in the prior art. When both
the T7 promoter and NS3/SEAP constructs are present, SEAP can be detected in the cell
medium within the usual viral assay timeframe of 24-48 hours; however, the timeframe
should not be read as a limitation, because it is theoretically possible to detect the
SEAP in the media only a few hours after transfection. The medium can then be
collected and analyzed. Various examples illustrating the use of this composition and
method will be detailed below.
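
Because SEAP release depends on NS3 protease activity, a compound's effect can be
expressed as the fraction of SEAP signal lost relative to control wells. The sketch
below is illustrative only: the normalization against uninhibited and background
controls is a common screening convention assumed here, not a procedure stated in this
description, and the function name and all readings are hypothetical.

```python
from statistics import mean


def percent_inhibition(sample, uninhibited, background):
    """Percent loss of SEAP signal for one compound-treated well.

    sample      -- SEAP reading from the treated well (arbitrary units)
    uninhibited -- readings from untreated wells (assumed full protease activity)
    background  -- readings from wells assumed to release no SEAP
    """
    top = mean(uninhibited)
    bottom = mean(background)
    window = top - bottom
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    return 100.0 * (top - sample) / window


# Hypothetical luminescence readings, invented for illustration.
uninhibited_wells = [1000.0, 1040.0, 960.0]
background_wells = [50.0, 60.0, 40.0]

for reading in (1000.0, 525.0, 50.0):
    value = percent_inhibition(reading, uninhibited_wells, background_wells)
    print(f"reading {reading:7.1f} -> {value:5.1f}% inhibition")
```

A reading equal to the uninhibited mean gives 0% and a reading equal to background
gives 100%; a separate cytotoxicity measurement would still be needed to distinguish
protease inhibition from cell death.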

Brief Description of the Drawings 

Figure 1 illustrates schematically the Vaccinia Virus NS3/SEAP System gene construct.

Figure 1B illustrates schematically the Plasmid/Vaccinia Virus NS3/SEAP assay.

Figure 2 illustrates schematically how the assay operates.

Figure 3 illustrates schematically the DI/DR Assay.

Figures 4A and 4B show the SEAP activity dose response curve for a representative
plasmid/virus assay.

Figure 5 shows an experimental 96 well plate diagram for the SEAP protocol on Day 1 in
Example 3.

Figure 6 shows an experimental 96 well plate diagram for the SEAP protocol on Day 2 in
Example 3.

Figure 7 shows SEAP activity and cytotoxicity data for Example 4.

Figure 8 shows a summary of DI/DR assay data.

Figure 9 illustrates the experimental plate set-up for Example 2.

Detailed Description of a Preferred Embodiment of the Invention 

The practice of this invention will employ, unless otherwise indicated, conventional
techniques of molecular biology, microbiology, recombinant DNA manipulation and
production, virology and immunology, which are within the skill of the art. Such
techniques are explained fully in the literature: Sambrook, Molecular Cloning: A
Laboratory Manual, Second Edition (1989); DNA Cloning, Volumes I and II (D. N. Glover,
Ed. 1985); Oligonucleotide Synthesis (M. J. Gait, Ed. 1984); Nucleic Acid
Hybridization (B. D. Hames and S. J. Higgins, Eds. 1984); Transcription and
Translation (B. D. Hames and S. J. Higgins, Eds. 1984); Animal Cell Culture (R. I.
Freshney, Ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A
Practical Guide to Molecular Cloning (1984); Gene Transfer Vectors for Mammalian Cells
(J. H. Miller and M. P. Calos, Eds. 1987, Cold Spring Harbor Laboratory); Methods in
Enzymology, Volumes 154 and 155 (Wu and Grossman, and Wu, Eds., respectively); (Mayer
and Walker, Eds.) (1987), Immunochemical Methods in Cell and Molecular Biology
(Academic Press, London); Scopes (1987); Expression of Proteins in Mammalian Cells
Using Vaccinia Viral Vectors in Current Protocols in Molecular Biology, Volume 2
(Frederick M. Ausubel, et al., Eds.) (1991). All patents, patent applications and
publications mentioned herein, both supra and infra, are hereby incorporated by
reference.



Both prokaryotic and eukaryotic host cells are useful for expressing desired coding
sequences when appropriate control sequences compatible with the designated host are
used. Among prokaryotic hosts, E. coli is most frequently used. Expression control
sequences for prokaryotes include promoters, optionally containing operator portions,
and ribosome binding sites. Transfer vectors compatible with prokaryotic hosts are
commonly derived from, for example, pBR322, a plasmid containing operons conferring
ampicillin and tetracycline resistance, and the various pUC vectors, which also
contain sequences conferring antibiotic resistance markers. These plasmids are
commercially available. The markers may be used to obtain successful transformants by
selection. Commonly used prokaryotic control sequences include the β-lactamase
(penicillinase) and lactose promoter systems (Chang et al., Nature (1977) 198:1056),
the tryptophan (trp) promoter system (Goeddel et al., Nuc Acids Res (1980) 8:4057),
the lambda-derived PL promoter and N gene ribosome binding site (Shimatake et al.,
Nature (1981) 292:128), and the hybrid tac promoter (De Boer et al., Proc Nat Acad Sci
USA (1983) 292:128) derived from sequences of the trp and lac UV5 promoters. The
foregoing systems are particularly compatible with E. coli; if desired, other
prokaryotic hosts such as strains of Bacillus or Pseudomonas may be used, with
corresponding control sequences.



Eukaryotic hosts include, without limitation, yeast and mammalian cells in culture
systems. Yeast expression hosts include Saccharomyces, Kluyveromyces, Pichia, and the
like. Saccharomyces cerevisiae, Saccharomyces carlsbergensis and K. lactis are the
most commonly used yeast hosts, and are convenient fungal hosts. Yeast-compatible
vectors carry markers which permit selection of successful transformants by conferring
prototrophy to auxotrophic mutants or resistance to heavy metals on wild-type strains.
Yeast-compatible vectors may employ the 2 µ origin of replication (Broach et al., Meth
Enzymol (1983) 101:307), the combination of CEN3 and ARS1, or other means for assuring
replication, such as sequences which will result in incorporation of an appropriate
fragment into the host cell genome. Control sequences for yeast vectors are known in
the art and include promoters for the synthesis of glycolytic enzymes (Hess et al., J
Adv Enzyme Reg (1968) 7:149; Holland et al., Biochem (1978) 17:4900), including the
promoter for 3-phosphoglycerate kinase (R. Hitzeman et al., J Biol Chem (1980)
255:2073). Terminators may also be included, such as those derived from the enolase
gene (Holland, J Biol Chem (1981) 256:1385).

Mammalian cell lines available as hosts for expression are known in the art and
include many immortalized cell lines available from the American Type Culture
Collection (ATCC), including HeLa cells, Chinese hamster ovary (CHO) cells, baby
hamster kidney (BHK) cells, BSC 1 cells, CV1 cells, and a number of other cell lines.
Suitable promoters for mammalian cells are also known in the art and include viral
promoters such as those from Simian Virus 40 (SV40) (Fiers et al., Nature (1978)
273:113), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus
(BPV). Mammalian cells may also require terminator sequences and poly-A addition
sequences. Enhancer sequences which increase expression may also be included, and
sequences which promote amplification of the gene may also be desirable (for example,
methotrexate resistance genes). These sequences are known in the art.

Vectors suitable for replication in mammalian cells are known in the art, and may
include viral replicons, or sequences which ensure integration of the appropriate
sequences encoding HCV epitopes into the host genome. For example, another vector used
to express foreign DNA is Vaccinia virus. In this case the heterologous DNA is
inserted into the Vaccinia genome, and transcription can be directed by either
endogenous vaccinia promoters or exogenous non-vaccinia promoters (e.g., the
bacteriophage T7 promoter) known to those skilled in the art, depending on the
characteristics of the constructed vector. Techniques for the insertion of foreign DNA
into the vaccinia virus genome are known in the art, and may utilize, for example,
homologous recombination. The heterologous DNA is generally inserted into a gene which
is non-essential to the virus, for example, the thymidine kinase gene (tk), which also
provides a selectable marker. Plasmid vectors that greatly facilitate the construction
of recombinant viruses have been described (see, for example, Mackett et al., J Virol
(1984) 49:857; Chakrabarti et al., Mol Cell Biol (1985) 5:3403; Moss, in GENE TRANSFER
VECTORS FOR MAMMALIAN CELLS (Miller and Calos, eds., Cold Spring Harbor Laboratory,
N.Y., 1987), p. 10). Expression of the HCV polypeptide then occurs in cells or animals
which are infected with the live recombinant vaccinia virus.

In order to detect whether or not the HCV polypeptide is expressed from the vaccinia
vector, BSC 1 cells may be infected with the recombinant vector and grown on
microscope slides under conditions which allow expression. The cells may then be
acetone-fixed, and immunofluorescence assays performed using serum which is known to
contain anti-HCV antibodies to a polypeptide(s) encoded in the region of the HCV
genome from which the HCV segment in the recombinant expression vector was derived.

Other systems for expression of eukaryotic or viral genomes include insect cells and
vectors suitable for use in these cells. These systems are known in the art, and
include, for example, insect expression transfer vectors derived from the baculovirus
Autographa californica nuclear polyhedrosis virus (AcNPV), which is a
helper-independent, viral expression vector. Expression vectors derived from this
system usually use the strong viral polyhedrin gene promoter to drive expression of
heterologous genes. Currently the most commonly used transfer vector for introducing
foreign genes into AcNPV is pAc373 (see PCT WO89/046699 and U.S. Ser. No. 7/456,637).
Many other vectors known to those of skill in the art have also been designed for
improved expression. These include, for example, pVL985 (which alters the polyhedrin
start codon from ATG to ATT, and introduces a BamHI cloning site 32 bp downstream from
the ATT; see Luckow and Summers, Virol (1989) 17:31). AcNPV transfer vectors for high
level expression of non-fused foreign proteins are described in co-pending
applications PCT WO89/046699 and U.S. Ser. No. 7/456,637. A unique BamHI site is
located following position -8 with respect to the translation initiation codon ATG of
the polyhedrin gene. There are no cleavage sites for SmaI, PstI, BglII, XbaI or SstI.
Good expression of non-fused foreign proteins usually requires foreign genes that
ideally have a short leader sequence containing suitable translation initiation
signals preceding an ATG start signal. The plasmid also contains the polyhedrin
polyadenylation signal and the ampicillin-resistance (amp) gene and origin of
replication for selection and propagation in E. coli.

Methods for the introduction of heterologous DNA into the desired site in the
baculovirus are known in the art. (See Summers and Smith, Texas Agricultural
Experiment Station Bulletin No. 1555; Smith et al., Mol. Cell Biol. (1983)
3:2156-2165; and Luckow and Summers, Virol. (1989) 17:31). For example, the
heterologous DNA can be inserted into a gene such as the polyhedrin gene by homologous
recombination, or into a restriction enzyme site engineered into the desired
baculovirus gene. The inserted sequences may be those which encode all or varying
segments of the polyprotein, or other ORFs which encode viral polypeptides. For
example, the insert could encode the following amino acid segments from the
polyprotein: amino acids 1-1078; amino acids 332-662; amino acids 406-662; amino acids
156-328; and amino acids 199-328.
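
As a quick check on insert sizes, the segment boundaries listed above can be converted
into residue counts and minimum coding lengths. This is only a sketch: it assumes the
ranges are inclusive and counts three nucleotides per residue, ignoring any start or
stop codon, leader, or linker sequence that a real construct would add. The function
name is hypothetical.

```python
# Polyprotein segments listed in the text, as (first residue, last residue).
SEGMENTS = [(1, 1078), (332, 662), (406, 662), (156, 328), (199, 328)]


def segment_size(start, end):
    """Return (residue count, coding nucleotides) for an inclusive residue range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    residues = end - start + 1
    return residues, 3 * residues


for start, end in SEGMENTS:
    residues, nucleotides = segment_size(start, end)
    print(f"amino acids {start}-{end}: {residues} residues, {nucleotides} nt")
```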

The signals for post-translational modifications, such as signal peptide cleavage,
proteolytic cleavage, and phosphorylation, appear to be recognized by insect cells.
The signals required for secretion and nuclear accumulation also appear to be
conserved between the invertebrate cells and vertebrate cells. Examples of the signal
sequences from vertebrate cells which are effective in invertebrate cells are known in
the art; for example, the human interleukin-2 signal (IL2S), which signals for
secretion from the cell, is recognized and properly removed in insect cells.

Transformation may be by any known method for introducing polynucleotides into a host
cell, including, for example, packaging the polynucleotide in a virus and transducing
a host cell with the virus, and direct uptake of the polynucleotide. The
transformation procedure used depends upon the host to be transformed. Bacterial
transformation by direct uptake generally employs treatment with calcium or rubidium
chloride (Cohen, Proc. Nat. Acad. Sci. USA (1972) 69:2110; T. Maniatis et al.,
"Molecular Cloning: A Laboratory Manual" (Cold Spring Harbor Press, Cold Spring
Harbor, N.Y., 1982)). Yeast transformation by direct uptake may be carried out using
the method of Hinnen et al., Proc. Nat. Acad. Sci. USA (1978) 75:1929. Mammalian
transformations by direct uptake may be conducted using the calcium phosphate
precipitation method of Graham and Van der Eb, Virol. (1978) 52:546, or the various
known modifications thereof. Other methods for introducing recombinant polynucleotides
into cells, particularly into mammalian cells, include dextran-mediated transfection,
calcium phosphate mediated transfection, polybrene mediated transfection, protoplast
fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and
direct microinjection of the polynucleotides into nuclei.

Vector construction employs techniques which are known in the art. Site-specific DNA
cleavage is performed by treating with suitable restriction enzymes under conditions
which generally are specified by the manufacturer of these commercially available
enzymes. In general, about 1 µg of plasmid or DNA sequence is cleaved by 1 unit of
enzyme in about 20 µL buffer solution by incubation for 1-2 hr at 37° C. After
incubation with the restriction enzyme, protein is removed by phenol/chloroform
extraction and the DNA recovered by precipitation with ethanol. The cleaved fragments
may be separated using polyacrylamide or agarose gel electrophoresis techniques,
according to the general procedures described in Meth. Enzymol. (1980) 65:499-560.
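
The rule of thumb for digestion conditions can be scaled to other amounts of DNA. The
sketch below reads the quantities as about 1 microgram of DNA per unit of enzyme in
about 20 microliters of buffer, the usual scale for such digests, and assumes simple
linear scaling; actual conditions should follow the enzyme manufacturer's
specification. The function name and defaults are illustrative.

```python
def scale_digest(dna_ug, units_per_ug=1.0, ul_per_ug=20.0):
    """Scale the approximate digest conditions linearly with the DNA amount.

    dna_ug       -- micrograms of plasmid or DNA fragment to cleave
    units_per_ug -- enzyme units per microgram (about 1 in the text)
    ul_per_ug    -- reaction volume per microgram (about 20 microliters in the text)
    """
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    return {
        "enzyme_units": dna_ug * units_per_ug,
        "volume_ul": dna_ug * ul_per_ug,
        "incubation": "1-2 hr at 37 C",
    }


print(scale_digest(5.0))
```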



Sticky-ended cleavage fragments may be blunt ended using E. coli DNA polymerase I
(Klenow fragment) with the appropriate deoxynucleotide triphosphates (dNTPs) present
in the mixture. Treatment with S1 nuclease may also be used, resulting in the
hydrolysis of any single stranded DNA portions.



Ligations are carried out under standard buffer and temperature conditions using T4
DNA ligase and ATP; sticky end ligations require less ATP and less ligase than blunt
end ligations. When vector fragments are used as part of a ligation mixture, the
vector fragment is often treated with bacterial alkaline phosphatase (BAP) or calf
intestinal alkaline phosphatase to remove the 5'-phosphate, thus preventing
re-ligation of the vector. Alternatively, restriction enzyme digestion of unwanted
fragments can be used to prevent ligation. Ligation mixtures are transformed into
suitable cloning hosts, such as E. coli, and successful transformants selected using
the markers incorporated (e.g., antibiotic resistance), and screened for the correct
construction.



Synthetic oligonucleotides may be prepared using an automated oligonucleotide
synthesizer as described by Warner, DNA (1984) 3:401. If desired, the synthetic
strands may be labeled with 32P by treatment with polynucleotide kinase in the
presence of 32P-ATP under standard reaction conditions.

DNA sequences, including those isolated from cDNA libraries, may be modified by known
techniques, for example by site-directed mutagenesis (see, e.g., Zoller, Nuc. Acids
Res. (1982) 10:6487). Briefly, the DNA to be modified is packaged into phage as a
single stranded sequence, and converted to a double stranded DNA with DNA polymerase,
using as a primer a synthetic oligonucleotide complementary to the portion of the DNA
to be modified, where the desired modification is included in the primer sequence. The
resulting double stranded DNA is transformed into a phage-supporting host bacterium.
Cultures of the transformed bacteria which contain copies of each strand of the phage
are plated in agar to obtain plaques. Theoretically, 50% of the new plaques contain
phage having the mutated sequence, and the remaining 50% have the original sequence.
Replicates of the plaques are hybridized to labeled synthetic probe at temperatures
and conditions which permit hybridization with the correct strand, but not with the
unmodified sequence. The sequences which have been identified by hybridization are
recovered and cloned.

DNA libraries may be probed using the procedure of Grunstein and Hogness,
Proc. Natl. Acad. Sci. USA (1975) 73:3961. Briefly, in this procedure the DNA to be
probed is immobilized on nitrocellulose filters, denatured, and pre-hybridized with a
buffer containing 0-50% formamide, 0.75 M NaCl, 75 mM Na citrate, 0.02% (wt/v)
each of bovine serum albumin, polyvinylpyrrolidone, and Ficoll®, 50 mM NaH2PO4
(pH 6.5), 0.1% SDS, and 100 µg/mL carrier denatured DNA. The percentage of
formamide in the buffer, as well as the time and temperature conditions of the
pre-hybridization and subsequent hybridization steps, depend on the stringency required.
Oligomeric probes which require lower stringency conditions are generally used with
low percentages of formamide, lower temperatures, and longer hybridization times.
Probes containing more than 30 or 40 nucleotides, such as those derived from cDNA
or genomic sequences, generally employ higher temperatures, e.g., about 40°-42° C,
and a high percentage of formamide, e.g., 50%. Following pre-hybridization,
5'-32P-labeled oligonucleotide probe is added to the buffer, and the filters are incubated in
this mixture under hybridization conditions. After washing, the treated filters are
subjected to autoradiography to show the location of the hybridized probe; DNA in
corresponding locations on the original agar plates is used as the source of the
desired DNA.

For routine vector constructions, ligation mixtures are transformed into E. coli
strain HB101 or other suitable hosts, and successful transformants are selected by
antibiotic resistance or other markers. Plasmids from the transformants are then
prepared according to the method of Clewell et al., Proc. Natl. Acad. Sci. USA (1969)
62:1159, usually following chloramphenicol amplification (Clewell, J. Bacteriol. (1972)
110:667). The DNA is isolated and analyzed, usually by restriction enzyme analysis
and/or sequencing. Sequencing may be performed by the dideoxy method of Sanger
et al., Proc. Natl. Acad. Sci. USA (1977) 74:5463, as further described by Messing et
al., Nuc. Acids Res. (1981) 9:309, or by the method of Maxam et al., Meth. Enzymol.
(1980) 65:499. Problems with band compression, which are sometimes observed in
GC-rich regions, were overcome by use of 7-deazaguanosine according to Barr et al.,
Biotechniques (1986) 4:428.

Target plasmid sequences are replicated by a polymerizing means which
utilizes a primer oligonucleotide to initiate the synthesis of the replicate chain. The
primers are selected so that they are complementary to sequences of the plasmid.
Oligomeric primers which are complementary to regions of the sense and antisense
strands of the plasmids can be designed from the plasmid sequences already known
in the literature.

The primers are selected so that their relative positions along a duplex
sequence are such that an extension product synthesized from one primer, when it is
separated from its template (complement), serves as a template for the extension of
the other primer to yield a replicate chain of defined length.

The primer is preferably single stranded for maximum efficiency in
amplification, but may alternatively be double stranded. If double stranded, the primer
is first treated to separate its strands before being used to prepare extension
products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be
sufficiently long to prime the synthesis of extension products in the presence of the
agent for polymerization. The exact lengths of the primers will depend on many
factors, including temperature and source of the primer and use of the method. For
example, depending on the complexity of the target sequence, the oligonucleotide
primer typically contains about 15-45 nucleotides, although it may contain more or
fewer nucleotides. Short primer molecules generally require cooler temperatures to
form sufficiently stable hybrid complexes with the template.
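
The dependence of annealing temperature on primer length can be illustrated with a rough estimate. The sketch below uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not taken from this specification and is only a coarse approximation for short oligonucleotides; the function name and the example sequences are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule of thumb is an illustration only; it ignores salt,
    primer concentration, and nearest-neighbor effects.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical primers of increasing length with the same base composition.
    for primer in ("ACGTACGTACGTACG", "ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGTACGTACGTAC"):
        print(len(primer), "nt ->", wallace_tm(primer), "deg C (estimate)")
```

Under this estimate a shorter primer of similar composition gives a lower value, which matches the statement that short primers need cooler temperatures to form stable hybrids.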

The primers used herein are selected to be "substantially" complementary to
the different strands of each specific sequence to be amplified. Therefore, the primers
need not reflect the exact sequence of the template, but must be sufficiently
complementary to selectively hybridize with their respective strands. For example, a
non-complementary nucleotide fragment may be attached to the 5'-end of the primer,
with the remainder of the primer sequence being complementary to the strand.
Alternatively, non-complementary bases or longer sequences can be interspersed into
the primer, provided that the primer has sufficient complementarity with the sequence
of one of the strands to be amplified to hybridize therewith, and to thereby form a
duplex structure which can be extended by the polymerizing means. The
non-complementary nucleotide sequences of the primers may include restriction enzyme
sites. Appending a restriction enzyme site to the end(s) of the target sequence would
be particularly helpful for cloning of the target sequence.

It will be understood that "primer", as used herein, may refer to more than one
primer, particularly in the case where there is some ambiguity in the information
regarding the terminal sequence(s) of the target region to be amplified. Hence, a
"primer" includes a collection of primer oligonucleotides containing sequences
representing the possible variations in the sequence or includes nucleotides which
allow a typical basepairing.
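
A "primer" that is really a collection of oligonucleotides can be pictured as the expansion of one degenerate sequence into every concrete sequence it represents. The following is a minimal sketch using the standard IUPAC nucleotide ambiguity codes; the function name and the example primer are hypothetical and do not correspond to any primer of this specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every unambiguous sequence represented by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    # Hypothetical primer with two ambiguous positions (R = A/G, Y = C/T).
    for seq in expand_degenerate("GARCTY"):
        print(seq)
```

The size of the collection is the product of the number of alternatives at each position, so ambiguity at even a few positions multiplies the number of oligonucleotides quickly.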

The oligonucleotide primers may be prepared by any suitable method.
Methods for preparing oligonucleotides of specific sequence are known in the art, and
include, for example, cloning and restriction of appropriate sequences, and direct
chemical synthesis. Chemical synthesis methods may include, for example, the
phosphotriester method described by Narang et al. (1979), the phosphodiester
method disclosed by Brown et al. (1979), the diethylphosphoramidite method
disclosed in Beaucage et al. (1981), and the solid support method in U.S. Pat. No.
4,458,066. The primers may be labeled, if desired, by incorporating labels
detectable by spectroscopic, photochemical, biochemical, immunochemical, or
chemical means.

Template-dependent extension of the oligonucleotide primer(s) is catalyzed by
a polymerizing agent in the presence of adequate amounts of the four
deoxyribonucleotide triphosphates (dATP, dGTP, dCTP and dTTP) or analogs, in a
reaction medium which is comprised of the appropriate salts, metal cations, and pH
buffering system. Suitable polymerizing agents are enzymes known to catalyze
primer- and template-dependent DNA synthesis. Known DNA polymerases include,
for example, E. coli DNA polymerase I or its Klenow fragment, T4 DNA polymerase,
and Taq DNA polymerase. The reaction conditions for catalyzing DNA synthesis with
these DNA polymerases are known in the art.

The products of the synthesis are duplex molecules consisting of the template
strands and the primer extension strands, which include the target sequence. These
products, in turn, serve as template for another round of replication. In the second
round of replication, the primer extension strand of the first cycle is annealed with its
complementary primer; synthesis yields a "short" product which is bounded on both
the 5'- and the 3'-ends by primer sequences or their complements. Repeated cycles
of denaturation, primer annealing, and extension result in the exponential
accumulation of the target region defined by the primers. Sufficient cycles are run to
achieve the desired amount of polynucleotide containing the target region of nucleic
acid. The desired amount may vary, and is determined by the function which the
product polynucleotide is to serve.
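
The phrase "sufficient cycles" can be made concrete with the usual idealized model of exponential amplification, in which each cycle multiplies the number of target copies by (1 + E), where E is the per-cycle efficiency (E = 1 for perfect doubling). This model and the function names below are illustrative assumptions and are not stated in the specification; real reactions plateau and fall short of the ideal.

```python
import math


def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles: N0 * (1 + E) ** n."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches target_copies in the ideal model."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    if target_copies <= initial_copies:
        return 0
    return math.ceil(math.log(target_copies / initial_copies) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    print(pcr_copies(1, 10))              # ideal doubling for 10 cycles
    print(cycles_needed(1, 1e6))          # cycles to reach a million copies at E = 1
    print(cycles_needed(1, 1e6, 0.8))     # more cycles are needed at lower efficiency
```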

The PCR method can be performed in a number of temporal sequences. For
example, it can be performed step-wise, where after each step new reagents are
added, or in a fashion where all of the reagents are added simultaneously, or in a
partial step-wise fashion, where fresh reagents are added after a given number of
steps.

In a preferred method, the PCR reaction is carried out as an automated
process which utilizes a thermostable enzyme. In this process the reaction mixture is
cycled through a denaturing region, a primer annealing region, and a reaction region.
A machine may be employed which is specifically adapted for use with a thermostable
enzyme, which utilizes temperature cycling without a liquid handling system, since the
enzyme need not be added at every cycle. This type of machine is commercially
available from Perkin Elmer Cetus Corp.

After amplification by PCR, the target polynucleotides are detected by
hybridization with a probe polynucleotide which forms a stable hybrid with that of the
target sequence under stringent to moderately stringent hybridization and wash
conditions. If it is expected that the probes will be completely complementary (i.e.,
about 99% or greater) to the target sequence, stringent conditions will be used. If
some mismatching is expected, for example if variant strains are expected with the
result that the probe will not be completely complementary, the stringency of
hybridization may be lessened. However, conditions are chosen which rule out
nonspecific/adventitious binding. Conditions which affect hybridization, and which
select against nonspecific binding are known in the art, and are described in, for
example, Maniatis et al. (1982). Generally, lower salt concentration and higher
temperature increase the stringency of binding. For example, it is usually considered
that stringent conditions are incubation in solutions which contain approximately
0.1x SSC, 0.1% SDS, at about 65° C incubation/wash temperature, and moderately
stringent conditions are incubation in solutions which contain approximately
1-2x SSC, 0.1% SDS and about 50°-65° C incubation/wash temperature. Low
stringency conditions are 2x SSC and about 30°-50° C.
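
The SSC multiples quoted above translate directly into salt concentrations. The sketch below assumes the conventional definition of SSC (a 20x stock of 3 M NaCl and 0.3 M sodium citrate, so that 1x is 150 mM NaCl and 15 mM sodium citrate); that definition is not given in this specification, and the function names are hypothetical.

```python
# Assumption: 1x SSC = 150 mM NaCl + 15 mM sodium citrate (conventional 20x stock / 20).
NACL_MM_PER_X = 150.0
CITRATE_MM_PER_X = 15.0


def ssc_composition(fold: float) -> tuple[float, float]:
    """Return (NaCl mM, sodium citrate mM) for a given multiple of SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return fold * NACL_MM_PER_X, fold * CITRATE_MM_PER_X


def ssc_fold_from_nacl(nacl_mm: float) -> float:
    """Express a NaCl concentration (mM) as a multiple of SSC."""
    return nacl_mm / NACL_MM_PER_X


if __name__ == "__main__":
    # Conditions as quoted in the text: (label, SSC multiple, temperature range in deg C).
    for label, fold, temp in [
        ("stringent", 0.1, "65"),
        ("moderately stringent", 1.0, "50-65"),
        ("moderately stringent", 2.0, "50-65"),
        ("low stringency", 2.0, "30-50"),
    ]:
        nacl, citrate = ssc_composition(fold)
        print(f"{label}: {fold}x SSC = {nacl:g} mM NaCl, {citrate:g} mM citrate at {temp} C")
    # The pre-hybridization buffer described earlier (0.75 M NaCl) corresponds to:
    print(ssc_fold_from_nacl(750.0), "x SSC")
```

On this reading the stringent wash contains roughly one tenth to one twentieth of the salt of the moderate and low stringency washes, consistent with the statement that lower salt and higher temperature increase stringency.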

Probes for plasmid target sequences may be derived from well known
restriction sites. The plasmid probes may be of any suitable length which span the
target region, but which exclude the primers, and which allow specific hybridization to
the target region. If there is to be complete complementarity, i.e., if the strain contains
a sequence identical to that of the probe, the probes may be short, i.e., in the range of
about 10-30 base pairs, since the duplex will be relatively stable under even stringent
conditions. If some degree of mismatch is expected with the probe, i.e., if it is
suspected that the probe will hybridize to a variant region, the probe may be of
greater length, since length seems to counterbalance some of the effect of the
mismatch(es).

The probe nucleic acid having a sequence complementary to the target
sequence may be synthesized using techniques similar to those described supra for the
synthesis of primer sequences. If desired, the probe may be labeled. Appropriate
labels are described supra.

In some cases, it may be desirable to determine the length of the PCR product
detected by the probe. This may be particularly true if it is suspected that variant
plasmid products may contain deletions within the target region, or if one wishes to
confirm the length of the PCR product. In such cases it is preferable to subject the
products to size analysis as well as hybridization with the probe. Methods for
determining the size of nucleic acids are known in the art, and include, for example,
gel electrophoresis, sedimentation in gradients, and gel exclusion chromatography.

The presence of the target sequence in a biological sample is detected by
determining whether a hybrid has been formed between the polynucleotide probe and
the nucleic acid subjected to the PCR amplification technique. Methods to detect
hybrids formed between a probe and a nucleic acid sequence are known in the art.
For example, for convenience, an unlabeled sample may be transferred to a solid
matrix to which it binds, and the bound sample subjected to conditions which allow
specific hybridization with a labeled probe; the solid matrix is then examined for the
presence of the labeled probe. Alternatively, if the sample is labeled, the unlabeled
probe is bound to the matrix, and after the exposure to the appropriate hybridization
conditions, the matrix is examined for the presence of label. Other suitable
hybridization assays are described supra. Analysis of the nucleotide sequence of the
target region(s) may be by direct analysis of the PCR amplified products. A process
for direct sequence analysis of PCR amplified products is described in Saiki et al.
(1988).

Alternatively, the amplified target sequence(s) may be cloned prior to
sequence analysis. A method for the direct cloning and sequence analysis of
enzymatically amplified genomic segments has been described by Scharf (1986). In
the method, the primers used in the PCR technique are modified near their 5'-ends to
produce convenient restriction sites for cloning directly into, for example, an M13
sequencing vector. After amplification, the PCR products are cleaved with the
appropriate restriction enzymes. The restriction fragments are ligated into the M13
vector, and transformed into, for example, a JM103 host, plated out, and the resulting
plaques are screened by hybridization with a labeled oligonucleotide probe. Other
methods for cloning and sequence analysis are known in the art.

Construction of the HCV/SEAP reporter gene plasmid

General Method

In the first embodiment, the Tropix® pCMV/SEAP expression vector is used as
a starting point for construction of the HCV NS3 protease plasmid construct pHCAP1
(Seq. ID. NOS. 1-7). pHCAP1 is constructed from the pTM3 vector (Moss et al.,
Nature, 348:91-92 (1990)) in which the nucleotide sequence encoding the portion of
the HCV-BK polyprotein domains NS2-NS3-NS4A-NS4B was cloned from the
pBKCMV/NS2-NS3-NS4A-NS4B-SEAP (the pBK/HCAP) construct. pBK/HCAP is the
eukaryotic expression plasmid in which all the original subcloning and ligation of
the HCV NS gene fragments and SEAP gene was carried out. pCMV/SEAP is a
mammalian expression vector designed for studies of promoter/enhancer elements
with SEAP as a reporter (Berger et al. (1988)). The vector contains a polylinker for
promoter/enhancer insertion, as well as an intron and polyadenylation signals from
SV40. The vector can be propagated in E. coli due to the pUC19 derived origin of
replication and ampicillin resistance gene. Modification of the commercially available
plasmids is accomplished by use of PCR techniques including mutational PCR.
Although this particular plasmid is described in the examples that follow, it is not the
only plasmid or vector which may be used. The T7 RNA polymerase promoter is part
of the pTM3 plasmid which was preferred in construction of the pHCAP vector.


In an alternate embodiment, the pTKgptF2s plasmid (Falkner and Moss, J.
Virol. 62:1849-1854 (1988)) can be used instead of the pTM3 plasmid, which places
the HCV/SEAP gene construct under transcriptional control of the native vaccinia
virus promoter. The only requirement is that the promoter operate when placed in a
plasmid having vaccinia virus regions flanking the subcloning region. This requirement
allows homologous recombination of the plasmid with the wild type vaccinia virus. Other
vaccinia virus intermediate plasmids would be operable here as well.

Example 1

The Tropix® pCMV/SEAP expression vector is first modified so that both Sac I
restriction sites are inactivated. This is done by cleaving the plasmid with Bam HI,
which results in a 5' cleavage product that contains the plasmid 5' ATG site and about
250 bp ending at the Bam HI site, and a 3' cleavage product having Bam HI sites at
its 5' end and at its 3' end. The 5' cleavage fragment was then amplified from the
pCMV/SEAP plasmid using primers that were designed to delete the 5' ATG codon
and to create a Sac I site on the 5' end. The downstream 3' primer spanned the Bam
HI site that is present within the SEAP coding sequence. Thus after PCR, the
amplified 5' fragment has a 5' Sac I site and a Bam HI site. The 5' primer introduced
an extra codon (a glutamic acid residue) in front of the first leucine residue of the
SEAP secretion signal. Furthermore, the first leucine codon was changed from a
CTG to a CTC codon (a silent change). The codon change was made to create the
second half of the Sac I site:

5'-GAGCTC-X-GGATCC-3' (Seq. ID NO:22)
Sac I site   5' end of SEAP   Bam HI

The modified sequence is then cloned into pGEM3Zf(+) (Promega). The Bam
HI-Bam HI SEAP fragment was subcloned into pAlter-1 (Promega), which is a
plasmid that has an f1 origin of replication so it produces a single strand DNA for use
in oligo mediated site directed mutagenesis. The Sac I sites within the SEAP
fragment were mutated by oligo mediated site directed mutagenesis (GAGCTC to
GAGCTG, a silent change, at the first site) and the same change at the second Sac I site
(GAGCTC to GAGCTG, an amino acid change from Serine to Cysteine). The
complete SEAP pGEM3Zf(+) plasmid is then made by subcloning the PCR modified
5' SEAP fragment into the Sac I - Bam HI sites of pGEM3Zf(+). The resulting plasmid
was then linearized with Bam HI to allow the subcloning of the 3' SEAP Bam HI-Bam
HI fragment from the pAlter-1 plasmid which was used for the oligo mediated site directed
mutagenesis to disrupt the two internal Sac I sites. A clone with the correct
orientation of the Bam HI-Bam HI fragment distal to the 5' SEAP fragment was
selected after analysis of purified plasmid DNA by restriction enzyme digest. This clone was
used in the subsequent subcloning steps for the construction of the HCV/SEAP
construct.
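
The effect of the GAGCTC to GAGCTG change on the Sac I recognition sequence can be checked with a simple string scan. The sketch below is illustrative only: the test sequence is a hypothetical stand-in, not the SEAP sequence, and the helper names are not from this specification. Because the Sac I (GAGCTC) and Bam HI (GGATCC) recognition sequences are palindromic, scanning one strand is sufficient.

```python
SAC_I = "GAGCTC"
BAM_HI = "GGATCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) occurrence of site."""
    seq, site = seq.upper(), site.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical fragment carrying two internal Sac I sites.
    original = "AAGAGCTCTTGAGCTCAA"
    mutated = original.replace("GAGCTC", "GAGCTG")
    print(find_sites(original, SAC_I))   # two sites before mutagenesis
    print(find_sites(mutated, SAC_I))    # none after the C-to-G change
    print(reverse_complement(SAC_I) == SAC_I)  # palindromic site
```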

The coding sequences for the HCV proteins and NS3 cleavage sites that
comprise the final HCV/SEAP polyprotein were generated in two separate PCRs from
cDNA of the HCV-BK strain (Accession No. M58335; Takamizawa, A., et al., J. Virol.
65:1105-1113 (1991)). The first amplified fragment starts with the amino acid coding
sequence of the HCV polyprotein corresponding to the C-terminal 81 amino acids of
the putative E2 region, which are upstream of the beginning of the putative NS2
region, or amino acid 729

(ARVCACLWMMLLIAQAEAALENLWLNSASVAGAHGILSFLVFFCAAWYIKGRLVPG
ATYALYGVWPLLLLLLALPPRAYAMDREMAA) (Seq. ID NO:23) 

or nucleotide 2187

(GCACGTGTCTGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCCGAGGC 
CGCCTTGGAGAACCTGGTGGTCCTCAATGCGGCGTCTGTGGCCGGCGCACATG 
GCATCCTCTCCTTCCTTGTGTTCTTCTGTGCCGCCTGGTACATCAAAGGCAGGCT 
GGTCCCTGGGGCGGCATATGCTCTTTATGGCGTGTGGCCGCTGCTCCTGCTCTT
GCTGGCATTACCACCGCGAGCTTACGCCATGGACCGGGAGATGGC) (Seq. ID 
NO:24) 

and contains the DNA encoding the HCV polyprotein domains NS2-NS3-NS4A
through the first 176 amino acids of the NS4B gene

(CASHLPYIEQ GMQLAEQFKQ KALGLLQTAT KQAEAAAPVV ESKWRALETF 
WAKHMWNFIS GIQYLAGLST LPGNPAIASL MAFTASITSPLTTQSTLLFN 
ILGGWVAAQL APPSAASAFV GAGIAGAAVG SIGLGKVLVD 
ILAGYGAGVAGALVAFKVMS GEMPSTEDLV NLLPAIL) (Seq. ID NO:25)

or amino acid 1886 or nucleotide 5658 

(TGCGCCTCGCACCTCCCTTACATCGAGCAGGGAATGCAGCTCGCCGAGCAATT 
CAAGCAGAAAGCGCTCGGGTTACTGCAAACAGCCACCAAACAAGCGGAGGCTG
CTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCTTGAGACATTCTGGGCGAAG 
CACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCTTATCCACTCTGC 
CTGGGAACCCCGCAATAGCATCATTGATGGCATTCACAGCCTCTATCACCAGCC 
CGCTCACCACCCAAAGTACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTG 
CCCAACTCGCCCCCCCCAGCGCCGCTTCGGCTTTCGTGGGCGCCGGCATCGCC
GGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGGTGCTTGTGGACATTCTGGC 
GGGTTATGGAGCAGGAGTGGCCGGCGCGCTCGTGGCCTTTAAGGTCATGAGCG 
GCGAGATGCCCTCCACCGAGGACCTGGTCAATCTACTTCCTGCCATC) (Seq. ID
NO:26)

The primers used to amplify the fragment were designed to contain an Eco RI site and
an ATG codon in the 5' primer (Seq. ID NO:27) and an Xho I site in the 3' primer
(Seq. ID NO:28). The amplified fragment was accordingly subcloned as an Eco RI -
Xho I fragment into pET24a(+) plasmid (Novagen). The second fragment amplified
from the HCV strain BK cDNA encompasses the putative NS5A/5B cleavage site
(EEASEDWCCSMSYTWTGAL) (Seq. ID NO:29). The 5' primer that was used to
amplify the cleavage site was designed to have an Xho I site (Seq. ID NO:30)
whereas the 3' primer was designed to have a Sac I site (Seq. ID NO:31). The
resulting PCR product was subcloned as an Xho I - Sac I fragment into pET24a(+),
which had been digested with Xho I - Hind III, as part of a three way ligation (Seq. ID
NO:32). The third fragment in the three way ligation was the Sac I - Hind III fragment
from the SEAP pGEM3Zf(+) plasmid. The Sac I - Hind III fragment encompassed the
modified SEAP gene and also 30 base pairs of the pGEM3Zf(+) polylinker which
included the multiple cloning sites (MCS) between the Bam HI and Hind III sites. The
final HCV/SEAP construct was assembled using pBKCMV as the vector. pBKCMV
was digested with Eco RI and Hind III and then used in a three way ligation with the
NS5A/5B - SEAP Xho I - Hind III fragment and the Eco RI - Xho I NS2-NS4B fragment.

The control plasmids for the assay (pHCAP3, pHCAP4) were constructed in a
similar manner to the HCV/SEAP construct. The control plasmids have either an
inactive form of NS3 protease or inactive forms of both NS2 protease and NS3
protease. Inactivation of NS2 and NS3 proteases was accomplished by oligo
mediated site directed mutagenesis performed on the PCR amplified NS2 - NS4B
fragment that had been subcloned into pALTER-1 as an Eco RI - Xho I fragment
together with the NS5A/5B Xho I - Sac I fragment. In order to inactivate the NS3
protease, the catalytic serine residue was substituted with an alanine by replacing
thymidine (TCG) with guanine (GCG) (base 2754). The NS2 protease was inactivated
by substitution of the catalytic cysteine residue with an alanine residue (TGT ->
GCT) (bases 2238-2239). The resulting inactivated NS3 protease and inactivated
NS2-NS3 protease variants of the NS2-NS4B fragment were each subcloned into
pBKCMV as separate Eco RI - Xho I fragments together with the NS5A/5B - SEAP
Xho I - Hind III fragment.
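
The codon changes named in this example can be checked against the standard genetic code. The sketch below builds the standard codon table and classifies each change; the function name is hypothetical, and only the codons quoted in the text (TCG to GCG for the NS3 serine, TGT to GCT for the NS2 cysteine, and the CTG to CTC leucine change in the SEAP signal sequence) are taken from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def describe_change(old_codon: str, new_codon: str) -> tuple[str, str, str]:
    """Return (old amino acid, new amino acid, kind of change) for a codon substitution."""
    old_aa = CODON_TABLE[old_codon.upper()]
    new_aa = CODON_TABLE[new_codon.upper()]
    if old_aa == new_aa:
        kind = "silent"
    elif new_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return old_aa, new_aa, kind


if __name__ == "__main__":
    for label, old, new in [
        ("NS3 catalytic serine", "TCG", "GCG"),
        ("NS2 catalytic cysteine", "TGT", "GCT"),
        ("SEAP signal leucine", "CTG", "CTC"),
    ]:
        print(label, old, "->", new, describe_change(old, new))
```

Both protease-inactivating substitutions replace the catalytic residue with alanine, while the leucine codon change leaves the encoded protein unchanged.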

The pHCAP1 (NS2^WT NS3^WT) (Seq. ID NOS:1-7), pHCAP3 (NS2^WT NS3^MUT)
(Seq. ID NOS:8-14), and pHCAP4 (NS2^MUT NS3^MUT) (Seq. ID NOS:15-21)
plasmids were constructed using pTM3 as the vector and the appropriate HCV/SEAP
fragment from the corresponding pBKHCV/SEAP constructs. The pBKHCV/SEAP
constructs were first digested with Eco RI and the Eco RI site was filled in using
Klenow fragment in a standard fill in reaction. The pBKHCV/SEAP constructs were
then digested with Xba I and the gel purified HCV/SEAP fragment was subcloned into
pTM3 that had been digested with Sma I and Spe I. Subcloning the HCV/SEAP
fragment into the Sma I site will result in an additional 6 amino acids (MGIPQF) (Seq.
ID NO:33) at the N-terminus (codons 1426-1444) if the preferred translational start
codon, which is part of the Nco I site in pTM3, is used.

The pHCAP1 (NS2^WT NS3^WT), pHCAP3 (NS2^WT NS3^MUT), and pHCAP4
(NS2^MUT NS3^MUT) plasmids have been used to generate recombinant vaccinia viruses
as described in the next section.


Construction of the HCV/SEAP reporter gene viral vectors

Applicants have generated recombinant vaccinia virus using pHCAP1 and the
control plasmids, pHCAP3 and pHCAP4. Recombinant vaccinia viruses were
generated using standard procedures in which BSC-1 cells were infected with wild
type vaccinia virus (strain WR from ATCC) and then transfected with either pHCAP1,
pHCAP3, or pHCAP4. Selection of recombinant virus was performed by growth of
infected transfected cells in the presence of mycophenolic acid. The recombinant
vaccinia viruses are termed vHCAP1, vHCAP3, and vHCAP4 and correspond directly
with the pHCAP1, pHCAP3, and pHCAP4 plasmids. Large scale stocks of
vHCAP1, vHCAP3, and vHCAP4 were grown and titered in CV1 cells.

Transfection of Cell Lines Containing the HCV/SEAP reporter

In the first embodiment HeLa cells are transfected with the Hep C/SEAP
reporter gene plasmid, pHCAP1, and co-infected with vTF7.3, a recombinant
vaccinia virus (Fuerst et al., Proc. Natl. Acad. Sci. USA, 86:8122-8126 (1986)).
vTF7.3 expresses T7 RNA polymerase, which is required for transcription of the
reporter gene since it is under the control of the T7 promoter in the pTM3 plasmid. The
pTM3 plasmid is a vaccinia intermediate plasmid which can function as an expression
vector in cells when T7 RNA polymerase is provided in trans (Figure 2).

As described previously, the Hep C/SEAP reporter gene encodes a
polyprotein with the following gene order: HCV (strain BK) NS2-NS3-NS4A-NS4B' -
NS5A/5B cleavage site - SEAP. Thus the HCV sequence begins with the amino acid coding
sequence of the HCV polyprotein corresponding to the C-terminal 81 amino acids of
the putative E2 region, which are upstream of the start of the putative NS2 region (as
defined by Grakoui et al.) or amino acid 729, continues through the first 176
amino acids of the NS4B gene or amino acid 1886 (Seq. ID NOS:23-26), and is
proximal to the SEAP protein (see Figure 1). The NS5A/5B cleavage site has been
engineered between the end of NS4B' and the second codon of SEAP.



The working theory behind the unique design of the reporter gene construct is
that SEAP is tethered, as part of the NS2-NS3-NS4A-NS4B' -
NS5A/5B cleavage site - SEAP polyprotein, inside the cell. It has been shown that
NS2 is a hydrophobic protein and is associated with the outside of the endoplasmic
reticulum (ER) (Grakoui et al. (1993)). Thus, in the present invention, SEAP is
tethered to the ER via the action of NS2. Release of SEAP from the polyprotein
tether will occur upon NS3-mediated cleavage at the NS5A/5B cleavage site. SEAP
is then secreted from the cell and can be monitored by assaying media for alkaline
phosphatase activity (Figure 1B). It is assumed that NS3-mediated cleavage at
the NS5A/5B site is the cleavage necessary to release SEAP from the
upstream polyprotein sequences. However, NS3-mediated cleavage at other sites
within the polyprotein may be responsible for SEAP release and hence its subsequent
secretion. Both NS3 and NS3/NS4A, where NS4A is a cofactor for NS3, can mediate
cleavage at the NS3/4A and NS4A/4B cleavage sites, which are present in the polyprotein
in addition to the engineered NS5A/5B cleavage site. Thus there may be more than
one NS3-mediated cleavage event occurring over the length of the polyprotein before
SEAP is available to the cell secretion apparatus and secreted from the cell. Further,
in alternative embodiments the tether may be changed depending upon the
chosen cleavage site. In addition, NS2 is an autocatalytic protease; it mediates the
cleavage event between its carboxy-terminal end and the NS3 N-terminus. In the
Hep C/SEAP polyprotein, NS2-mediated cleavage at the NS2/NS3 site would release
the NS3-NS4A-NS4B-SEAP polyprotein from the ER.

The above described system can be used to evaluate potential NS3 inhibitors by
monitoring the effect of increasing drug concentration on SEAP activity. NS3
inhibition would be detected as a decrease in SEAP activity. Recognizing that a
decrease in SEAP activity could also be due to cytotoxicity of a given compound
or a non-specific effect on vaccinia virus which would adversely affect SEAP
transcription, appropriate controls are used as discussed below.
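
The logic of reading a SEAP decrease against a cytotoxicity control can be sketched as a simple normalization. Everything in the block below is an assumption made for illustration: the function names, the use of an inactive-protease control as the background signal, the 80% viability floor, the 50% SEAP cut-off, and the example readings are hypothetical and are not values from this specification.

```python
def percent_of_control(value: float, untreated: float, background: float = 0.0) -> float:
    """Express a reading as a percentage of the untreated control after background subtraction."""
    span = untreated - background
    if span <= 0:
        raise ValueError("untreated control must exceed background")
    return 100.0 * (value - background) / span


def classify(seap_pct: float, viability_pct: float,
             viability_floor: float = 80.0, seap_cutoff: float = 50.0) -> str:
    """Label a well; the thresholds are hypothetical and chosen only for illustration."""
    if viability_pct < viability_floor:
        return "possible cytotoxicity"
    if seap_pct < seap_cutoff:
        return "candidate inhibition"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical readings: SEAP luminescence and XTT absorbance per compound concentration.
    untreated_seap, background_seap, untreated_xtt = 1000.0, 100.0, 1.20
    wells = [(10.0, 900.0, 1.18), (100.0, 400.0, 1.15), (320.0, 150.0, 0.50)]
    for conc_um, seap, xtt in wells:
        seap_pct = percent_of_control(seap, untreated_seap, background_seap)
        viab_pct = percent_of_control(xtt, untreated_xtt)
        print(conc_um, round(seap_pct, 1), round(viab_pct, 1), classify(seap_pct, viab_pct))
```

A well is only counted as showing inhibition when the SEAP signal falls while the viability reading stays near the untreated level, which is the distinction the controls are meant to provide.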


In an alternate embodiment, a "cis-only" cleavage assay is contemplated. In
this assay the NS2^MUT NS3^WT variant of the HCV/SEAP (HCAP2) is used so the
polyprotein remains tethered to the outside of the endoplasmic reticulum because the
NS2 protease cannot catalyze the cleavage between its C-terminus and the NS3
N-terminus. Thus the only way for SEAP to be released from the tether is if the NS3
protease clips in cis at the NS5A/5B cleavage site. There should not be any trans
NS3 mediated cleavage events occurring since NS2 is not available to release the
NS3 N-terminus from its tether. The control plasmid or virus for this assay is the
NS2^MUT NS3^MUT variant HCAP4.


DI/DR Assay

A preferred embodiment involves the co-infection of BHK (ATCC No. CCL-10)
or CV1 cells (a COS1 derived line, ATCC No. CCL-70) with both vHCAP1 and
vTF7.3 (ATCC No. VR-2153), with CV1 being more preferred. The latter virus is
necessary since the Hep C/SEAP gene remains under control of the T7 RNA
polymerase promoter in the vHCAP recombinant viruses. Currently both
embodiments, which are termed the Hep C/SEAP transfection/infection assay and the
dual recombinant vaccinia virus assay (DI/DR assay) respectively, are useful for HCV
protease candidate compound evaluation (Figure 3).
Example 1

Protocol for vTF7.3 infection / HCV/SEAP Plasmid Transfection Experiment
Day 1

Flat-bottom 96 well plates were seeded with BHK cells at a density of 1 x 10^4
cells/well (equivalent to about 85% confluence after 24 hours). In general, one 96
well plate was used for investigation of each compound of interest (protease inhibitor).
An additional plate at the same cell density was used, with two rows
designated for each compound of interest at increasing concentrations, for
investigating the cytotoxicity of the compounds themselves in cells alone. Cytotoxicity
was determined by XTT assay (Sigma 4626).

Day 2

The established monolayer was transfected with either pHCAP1, pHCAP3,
pHCAP4, or pTM3 plasmids at a concentration of 0.4 µg/well as part of a DNA
Lipofectamine (Gibco BRL) transfection mixture. Infections of the established
monolayer with vTF7.3 preceded the transfection step. A working stock of vTF7.3
was diluted to a multiplicity of infection (MOI) of 10 with Optimem. The media was
aspirated from the wells (2B-10G) 2 rows at a time. A 50 µL aliquot of vTF7.3
inoculum was added per well and gently shaken every 10 minutes. 30 minutes after
inoculum addition, the transfection mixes were made by adding 1 mL of Optimem to 3
mL polystyrene tubes. To the media, 48 µg of plasmid DNA was then added to the
tubes and mixed, followed by 144 µL of Lipofectamine™, and then the mixture was
incubated at room temperature for 30 minutes. After incubation, 11 mL of Optimem were
added to each of the tubes and gently mixed. The vTF7.3 inoculum was aspirated from the
wells and 0.1 mL of transfection mix was added to each well and incubated at 34 °C
for 4 hours.

Compounds/drugs of interest for testing protease inhibition were
prepared as stock solutions of 40 mM in 100% DMSO. For assay use, the
compounds were diluted to 640 µM (2X) in Optimem with 4% FBS. The compound
dilutions were set up in an unused 96 well plate by adding 100 µL Optimem with 4%
FBS to wells 4-10 and 150 µL of compound dilutions to all wells in column 3. A serial
dilution of the compounds was then performed by transferring 46 µL from well to well
across the plate. The transfection mixture was then aspirated from the cells. Then 75
µL of Optimem with 4% FBS was added to the transfected monolayers, followed by 75 µL of
the 2X compound dilutions, and the plates were incubated at 34 °C for
48 hours. The cells were checked microscopically at 24 hours and media was collected
at 48 hours for measurement of SEAP activity.


SEAP Activity Measurement 

After 48 hours, SEAP activity was measured by first transferring 100 µL of
media from each well of the 96-well assay plate to a new sterile 96-well plate. The
plate(s) were sealed and heated in a heating block at 65 °C for 30 minutes, then
removed and cooled to room temperature. From each heat-treated plate, 50 µL of
heat-treated media was transferred to a Dynex (Dynex 7416) 96-well plate. To each
well was added 50 µL of Tropix assay buffer, and the plate was incubated at room
temperature for 5 minutes. Then 50 µL of Tropix reaction buffer/CSPD substrate
(Tropix) was added to each well and mixed, and the plate was incubated for an
additional 90 minutes at room temperature. Chemiluminescence was read in a Victor
multilabel counter (Wallac, Inc., model number 1420) as one-second counts, and data
were reported as luminescent units/second.

For Examples 1 and 2:

XTT Cytotoxicity Assay

XTT (Sigma 4626) was dissolved in phosphate buffered saline (PBS) to a final
concentration of 1 mg/mL; 5 mL was prepared per plate. To this solution was added
5 mM PMS (n-methyldibenzopyrazine methyl sulfate salt) (Sigma P9625) to a final
concentration of 20 µM. 50 µL of the XTT solution was added per well to the plate
set up for cytotoxicity. The plates were incubated at 37 °C in a 5% CO2 incubator for
about 3.5 hours, and the color change was then quantitated by reading absorbance in
a Vmax plate reader (Molecular Devices) at 450 nm/650 nm. Values were corrected by
subtracting media-only background and presented as % viable, with the untreated
cell control representing 100%.
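
The background correction and normalization described above can be sketched as follows. This is a minimal illustration, not the authors' own calculation: the function name and the example absorbance values are hypothetical, and it assumes a single background-subtracted reading per well.

```python
def percent_viable(sample_abs, media_only_abs, untreated_abs):
    """Return viability as a percentage of the untreated cell control.

    Each reading is first corrected by subtracting the media-only
    background, and the untreated control is then defined as 100%.
    """
    corrected_sample = sample_abs - media_only_abs
    corrected_control = untreated_abs - media_only_abs
    if corrected_control <= 0:
        raise ValueError("untreated control must exceed the media-only background")
    return 100.0 * corrected_sample / corrected_control


# Hypothetical absorbance readings, for illustration only.
viability = percent_viable(sample_abs=0.75, media_only_abs=0.25, untreated_abs=1.25)
print(f"{viability:.1f}% viable")
```

The CC50 is then the compound concentration at which this percentage falls to 50.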

Example 2 




Representative experiment and resulting data using Protocol of Example 1.

Compounds X, Y, and Z were evaluated in the Vaccinia Virus Infection/Plasmid
Transfection assay as outlined in Example 1. BHK cells were seeded into 96-well
plates at a density of 1 x 10^4 cells/well and grown overnight to approximately
85% confluency. The SEAP activity was monitored 48 hours post drug addition in
cells transfected with either pHCAP1, pHCAP4, pTM3, or no DNA. Concurrently,
Compounds X, Y, and Z were evaluated for cell cytotoxicity in a separate dose
response assay using XTT to measure cell viability.

For each compound, cells were infected with vTF7.3 followed by the plasmid
transfection step. The arrangement of the cells transfected with one of the three
plasmids is illustrated in Figure 9.

Results for Compounds X, Y, and Z are shown in Figures 4A and 4B and
Table 1 below. In the three graphs, the amount of SEAP activity detected in cells
transfected with the pHCAP1 plasmid ranges from 2- to 7-fold above the amount of
SEAP detected in cells transfected with the control plasmids, pHCAP4 and pTM3, or
cells only. The EC50 (µM) value represents the concentration of drug at which a 50%
reduction in SEAP activity is observed relative to the amount of SEAP activity
detected in the absence of drug. The CC50 (µM) value represents the concentration
of drug at which a 50% reduction in cell viability is observed relative to cells in the
absence of drug. The ratio CC50/EC50 yields the therapeutic index (TI) which, by
convention, should be greater than or equal to 10 in order for a compound to be
considered as demonstrating antiviral activity.

Table 1

Compound    EC50 (µM)    CC50 (µM)    Solubility (mM)    TI
X           45           178          = 100              4
Y           >320         112          = 100
Z           >320         112          = 100


















Within the compound dose range that was examined, only an EC50 value for
Compound X was obtained. However, since the TI value for Compound X was below
10, it was concluded that Compound X does not represent a candidate inhibitor of
NS3 protease activity. Compounds Y and Z did not demonstrate any efficacy in this
system and, therefore, are not considered potential candidates (Figs. 4A and 4B).
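
As a worked check of the Compound X result, the short sketch below takes the therapeutic index as CC50 divided by EC50, which is the ratio that reproduces the tabulated TI of 4 from the Table 1 values (EC50 = 45 µM, CC50 = 178 µM). The function and constant names are illustrative assumptions, not part of the original protocol.

```python
TI_THRESHOLD = 10  # conventional cutoff stated in the text


def therapeutic_index(cc50_um, ec50_um):
    """Therapeutic index: cytotoxic concentration over effective concentration."""
    if ec50_um <= 0:
        raise ValueError("EC50 must be positive")
    return cc50_um / ec50_um


# Compound X values from Table 1.
ti_x = therapeutic_index(cc50_um=178, ec50_um=45)
is_candidate = ti_x >= TI_THRESHOLD

print(f"TI = {ti_x:.2f}, candidate: {is_candidate}")
```

For Compounds Y and Z, only a lower bound on the EC50 (>320 µM) is available, so a TI cannot be computed this way; the text instead notes their lack of efficacy directly.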

For Examples 3 and 4: 

XTT Cytotoxicity Assay

XTT (Sigma 4626) was dissolved in phosphate buffered saline (PBS) to a final
concentration of 1 mg/mL; 5 mL was prepared per plate. To this solution was added
5 mM PMS (n-methyldibenzopyrazine methyl sulfate salt) (Sigma P9625) to a final
concentration of 20 µM. This XTT substrate solution was diluted with an equal
volume of MEM media containing 4% FBS (v/v). 100 µL/well of this final solution
was added to the original plate, which still contained the cell monolayer and about
50 µL of incubation media. The plates were incubated at 37 °C in a 5% CO2 incubator
for about 3.5 hours, and the color change was then quantitated by reading absorbance
in a Vmax plate reader (Molecular Devices) at 450 nm/650 nm. Values were corrected
by subtracting media-only background and presented as % viable, with the untreated
cell control representing 100%.

Example 3 

Protocol for Dual Infection/Dose Response (DI/DR) Assay
Day 1

Flat-bottom 96-well plates were seeded with CV1 cells at a density of 1 x 10^5
cells per well in MEM media containing 10% FBS with no Phenol Red. The plate was
set up as shown in Figure 5. Media only was placed in all the wells on the edge of the
plate, and only one compound was evaluated per plate (Fig. 5).

Day 2

Cells were infected with recombinant vaccinia viruses as follows. There
should be about 1.5 x 10^5 cells per well after incubation for 24 hours. For every plate
needed (one plate for each drug in the experiment), 4 mL of vTF7.3 in MEM with 4%
FBS (-) phenol red at a concentration of 2 x 10^6 pfu/mL was prepared and divided
into 2 mL aliquots. Either vHCAP1 or vHCAP3 was added to the vTF7.3 aliquots for a
final concentration of vHCAP of 1 x 10^7 pfu/mL. At 75 µL per well, this concentration
of virus stock delivers vTF7.3 at an MOI of 1 and vHCAP1 or vHCAP3 at an MOI of 5.
The arrangement of the experimental plate is shown in Figure 5.
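
The stated MOI values follow directly from the titers, the 75 µL inoculum volume, and the expected cell count per well. The sketch below reproduces that arithmetic; the function name is illustrative and the calculation assumes every plaque-forming unit in the inoculum is available to the monolayer.

```python
def multiplicity_of_infection(titer_pfu_per_ml, volume_ul, cells_per_well):
    """MOI = plaque-forming units delivered per well / cells per well."""
    pfu_delivered = titer_pfu_per_ml * (volume_ul / 1000.0)
    return pfu_delivered / cells_per_well


CELLS_PER_WELL = 1.5e5   # expected after 24 hours of incubation
INOCULUM_UL = 75         # volume added per well

moi_vtf7 = multiplicity_of_infection(2e6, INOCULUM_UL, CELLS_PER_WELL)
moi_vhcap = multiplicity_of_infection(1e7, INOCULUM_UL, CELLS_PER_WELL)

print(f"vTF7.3 MOI: {moi_vtf7:.1f}, vHCAP MOI: {moi_vhcap:.1f}")
```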

Drug stock solutions for use in the assay were made at a concentration of 40
mM in DMSO as in the previous protocol. The 40 mM drug stock solution was diluted
to 640 µM in MEM with 4% FBS (-) phenol red to yield a 2X drug working stock
solution. Using an empty 96-well plate, the drug dilution series was set up as follows:

100 µL of MEM with 4% FBS (-) phenol red was added to all wells in columns
4-10. 150 µL of 2X drug working stock solution was added to all wells in column 3.
46 µL was transferred from the wells of column 3 to the wells of column 4 and mixed.
The transfer of 46 µL was repeated from column 4 to column 5 and so on out to
column 10. The remaining 46 µL was discarded. The arrangement of the experimental
multiwell plate is shown in Figure 6.
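
The concentrations produced by this dilution scheme can be sketched as below. Transferring 46 µL into 100 µL of diluent gives a dilution factor of 146/46, roughly 3.17-fold (close to a half-log step) per column. The function name is an assumption for illustration, and the final (1X) concentrations assume the 2X dilutions are later mixed with an equal volume on the cell plate, as the protocol describes.

```python
def serial_dilution(start_conc, transfer_ul, diluent_ul, n_wells):
    """Concentration in each well of a serial dilution, starting well first."""
    factor = transfer_ul / (transfer_ul + diluent_ul)
    return [start_conc * factor ** i for i in range(n_wells)]


# Columns 3 through 10 of the dilution plate: 8 wells, starting at 640 uM (2X).
two_x_um = serial_dilution(start_conc=640.0, transfer_ul=46.0, diluent_ul=100.0, n_wells=8)

# Equal-volume mixing on the assay plate halves each concentration (1X).
final_um = [conc / 2.0 for conc in two_x_um]

for column, conc in zip(range(3, 11), final_um):
    print(f"column {column}: {conc:.2f} uM")
```

The highest final concentration is therefore 320 µM, which matches the upper limit of the dose range reported in the results.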

Media was aspirated from the CV1 monolayers. After aspiration, 75 µL per
well of appropriate virus inoculum or MEM with 4% FBS (-) phenol red was added to
the CV1 monolayers, then 75 µL was transferred from each well in the drug dilution
series plate to the corresponding wells on the cell monolayer plate. The assay plate
was incubated at 37 °C in a 5% CO2 incubator for 48 hours.

At Day 3, the cells were checked microscopically for phenotypic changes
around the 24 hour time point. At Day 4, 100 µL of media was collected from each
well, of which 50 µL was used in the measurement of SEAP activity. The 100 µL
aliquots were transferred to an unused 96-well plate and, after the plate was sealed, it
was heated to 65 °C for 30 minutes. 50 µL of each heat-treated sample was then
transferred to its corresponding well in a new 96-well opaque plate (Dynex 7416).
Using the Tropix® SEAP Phospha-Light™ kit, 50 µL of Tropix assay buffer was added
to each well and the plate was incubated at room temperature for 5 minutes. Next, 50
µL of Tropix reaction buffer/CSPD substrate was added and mixed. The plate was
incubated for 90 minutes at room temperature. The chemiluminescence was then
read using a Victor multi-label counter. The XTT assay for measuring cytotoxicity was
also performed on Day 4 as described.

Example 4 

Representative Experiment and Resulting Data Using Protocol of Example 3 

Compounds A-I were evaluated in the DI/DR assay using the standard
protocol given in Example 3. The data shown in Figure 7 and Figure 8 represent
assay results obtained at a 48 hour time point post drug addition.

The EC50 (µM) value represents the concentration of drug at which a 50%
reduction in SEAP activity is observed relative to the amount of SEAP activity
detected in the absence of drug. However, this latter value, the amount of SEAP
activity that is observed in the absence of drug, is first corrected for assay background
prior to the calculation of an EC50 value. The correction is made since in the inactive
NS3 protease construct, vHCAP3, a background level of SEAP activity is detected
(see SEAP Activity graph). This background SEAP activity represents non-NS3
protease mediated SEAP activity and therefore should not be affected by the addition
of an NS3 protease inhibitor. It is assumed that a fraction of the SEAP activity that is
observed in the active NS3 protease construct, vHCAP1, represents non-NS3
protease mediated SEAP activity. Therefore the amount of SEAP activity detected
with vHCAP1 is corrected for the fraction that corresponds to non-NS3 protease
mediated SEAP activity. The correction is as follows: N = (luminescent units of SEAP
activity of vHCAP1) - (luminescent units of SEAP activity of vHCAP3), where N is the
level of NS3 protease dependent SEAP activity. Accordingly, the EC50 is the drug
concentration at which SEAP activity in vHCAP1-infected cells falls to
(SEAP activity of vHCAP1) - N/2.
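
One way to apply this background correction to a dose-response series is sketched below. The target signal is the no-drug vHCAP1 signal minus N/2, as defined above. The log-linear interpolation between bracketing doses is an assumption made for illustration (the source does not state how the EC50 was read off the curve), and all names and readings are hypothetical.

```python
import math


def background_corrected_ec50(concs_um, seap_vhcap1, no_drug_vhcap1, no_drug_vhcap3):
    """Estimate the EC50 from vHCAP1 luminescence readings.

    concs_um must be ascending; seap_vhcap1 holds the matching readings.
    Returns None if the signal never falls to the target within the range.
    """
    n = no_drug_vhcap1 - no_drug_vhcap3   # NS3 protease dependent signal
    target = no_drug_vhcap1 - n / 2.0     # signal at 50% inhibition of N
    points = list(zip(concs_um, seap_vhcap1))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= target >= s_hi and s_lo != s_hi:
            # Assumed: interpolate linearly in log10(concentration).
            frac = (s_lo - target) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical readings in luminescent units/second, for illustration only.
ec50 = background_corrected_ec50(
    concs_um=[1.0, 10.0, 100.0],
    seap_vhcap1=[1000.0, 700.0, 300.0],
    no_drug_vhcap1=1000.0,
    no_drug_vhcap3=200.0,
)
print(ec50)
```

A return value of `None` corresponds to the ">320 µM" style of result reported when no 50% reduction is reached within the tested range.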

The CC50 (µM) value represents the concentration of drug at which a 50%
reduction in cell viability is observed relative to cells in the absence of drug. The ratio
CC50/EC50 yields the therapeutic index (TI) which, by convention, should be
greater than or equal to 10 in order for a compound to be considered as demonstrating
antiviral activity.

In Figure 7, increasing concentrations of Compound A were observed to have
no effect on SEAP activity. In the cell cytotoxicity component of the assay, it was
observed that increasing concentrations of Compound A did not result in a reduction
of cell viability of cells alone or cells infected with either vHCAP1/vTF7.3 or
vHCAP3/vTF7.3. The results obtained with Compounds B-I (Figure 8) demonstrate
a range of observed cytotoxicities from 15 µM to >320 µM, the latter being the upper
limit of drug concentrations tested in the DI/DR assay, although it is theoretically
possible to test drug concentrations above 320 µM. The EC50 values that were
observed for Compounds B-I ranged from 18 µM to >320 µM; however, the TI values
were under 10. Thus Compounds A-I do not represent potential inhibitors of NS3
protease activity.

1. A reporter gene system useful in the assessment of compounds which
augment or inhibit the activity of Hepatitis C virus NS3 protease comprising:

a) a recombinant viral vector comprising a DNA molecule encoding an 
RNA polymerase promoter compatible with said viral vector and which 
is expressed upon infection of a target mammalian cell; 

b) a recombinant plasmid comprising a DNA molecule encoding the 
HCV/SEAP reporter gene polyprotein which is expressed when 
transfected into a target mammalian cell; 

c) said target mammalian cell line being infected first with said 
recombinant viral vector then transfected with said recombinant 
plasmid such that the DNA molecule encoding the HCV/SEAP reporter 
gene is under transcriptional control of said promoter; and 

d) the target mammalian cell expressing said HCV/SEAP reporter gene 
polyprotein such that SEAP is secreted from said target mammalian 
cell. 



2. A reporter gene system useful in the assessment of compounds which 
augment or inhibit the activity of Hepatitis C virus NS3 protease comprising: 

a) a first recombinant viral vector comprising a DNA molecule encoding 
an RNA polymerase promoter compatible with said viral vector and 
which is expressed upon infection of a target mammalian cell; 

b) a second recombinant viral vector comprising a DNA molecule 
encoding the HCV/SEAP reporter gene polyprotein which is expressed 
upon infection of a target mammalian cell; 

c) said target mammalian cell line being infected first with said first
recombinant viral vector then co-infected with said second recombinant
plasmid such that the DNA molecule encoding the HCV/SEAP reporter
gene is under control of said promoter; and

d) the target mammalian cell expresses said HCV/SEAP reporter gene 
polyprotein such that SEAP is secreted from said target mammalian 
cell. 

3. The reporter gene system of claim 1 wherein said recombinant plasmid is the 
pTM3 plasmid containing said HepC/SEAP construct. 

4. The recombinant plasmid of claim 3 wherein said recombinant plasmid
comprises the pHCAP1 plasmid having a DNA molecule encoding the NS2
and NS3 protease polyproteins in a fusion protein fused with the SEAP gene
according to the sequence in Seq. ID NO: 1.

5. The recombinant plasmid of claim 3 wherein said recombinant plasmid further 
comprises the pHCAP3 plasmid containing the active NS2 protease and a 
mutant NS3 protease in a fusion protein fused with the SEAP gene according 
to the sequence in Seq. ID NO: 8. 

6. The recombinant plasmid of claim 3 wherein said recombinant plasmid further 
comprises the pHCAP4 plasmid containing the mutant inactive NS2 and 
mutant inactive NS3 protease in a fusion protein fused with the SEAP gene 
according to the sequence in Seq. ID NO: 15. 

7. The reporter gene system of claim 2 wherein said second recombinant viral
vector further comprises the vHCAP1 vector having a DNA molecule encoding
the NS2 and NS3 protease polyproteins in a fusion protein fused with the
SEAP gene according to the sequence in Seq. ID NO: 1.

8. The reporter gene system of claim 2 wherein said second recombinant viral
vector further comprises the vHCAP3 vector containing the active NS2
protease and a mutant NS3 protease in a fusion protein fused with the SEAP
gene according to the sequence in Seq. ID NO: 9.

9. The reporter gene system of claim 2 wherein said second recombinant viral
vector further comprises the vHCAP4 vector containing the active NS2
protease and a mutant NS3 protease in a fusion protein fused with the SEAP
gene according to the sequence in Seq. ID NO: 16.

10. The reporter gene system of claim 1 wherein said recombinant viral vector 
comprises a virus containing the DNA sequence encoding T7 RNA 
polymerase promoter. 

11. The recombinant viral vector of claim 7 wherein said vector is the vTF7.3
vector.

12. The reporter gene system of claim 2 wherein said first recombinant viral vector 
comprises a virus containing the DNA sequence encoding the T7 RNA 
polymerase promoter. 

13. The recombinant viral vector of claim 9 wherein said vector is the vTF7.3
vector.

14. The reporter gene system of claim 1 wherein said first recombinant viral vector 
comprises a virus containing the DNA sequence encoding a vaccinia virus 
compatible promoter. 

15. The first recombinant viral vector of claim 11 wherein said vector is a vaccinia
virus derived vector.

16. The reporter gene system of claim 2 wherein said first recombinant viral vector 
comprises a virus containing the DNA sequence encoding a vaccinia virus 
compatible promoter. 

17. The first recombinant viral vector of claim 13 wherein said vector is a vaccinia
virus derived vector.

18. A first recombinant viral vector according to claim 2 wherein the vector is
pTM3 plasmid, a Listeria vector, an orthopox virus, avipox virus, canarypox
virus, suipox virus, vaccinia virus, baculovirus, human adenovirus, SV40, 
Herpes Virus or bovine papilloma virus. 

19. A second recombinant viral vector according to claim 2 wherein the vector is
pTM3 plasmid, a Listeria vector, an orthopox virus, avipox virus, canarypox 
virus, suipox virus, vaccinia virus, baculovirus, human adenovirus, SV40, 
Herpes Virus or bovine papilloma virus. 

20. The reporter gene system of claim 1 wherein said recombinant viral vector
comprises a virus containing a DNA sequence encoding a promoter
selected from the group of mammalian viral vectors consisting of:

Simian Virus 40 (SV40), Rous Sarcoma Virus (RSV), Adenovirus (ADV) and
Bovine Papilloma Virus (BPV).

21. The reporter gene system of claim 2 wherein said recombinant viral vector
comprises a virus containing a DNA sequence encoding a promoter
selected from the group of mammalian viral vectors consisting of:

Simian Virus 40 (SV40), Rous Sarcoma Virus (RSV), Adenovirus (ADV) and
Bovine Papilloma Virus (BPV).



22. The reporter gene system of claim 1 wherein said target cell line is selected 
from the group consisting of: 

HeLa cells, Chinese Hamster Ovary cells, CV1 African Green Monkey cells, 
BSC 1 cells and Baby Hamster Kidney cells. 

23. The reporter gene system of claim 2 wherein said target cell line is selected 
from the group consisting of: 






HeLa cells, Chinese Hamster Ovary cells, CV1 African Green Monkey cells, 
BSC 1 cells and Baby Hamster Kidney cells. 

24. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the HepC/SEAP reporter gene construct according to claim 1.

25. The isolated DNA sequence of claim 24 comprising a DNA sequence or
variants thereof in SEQ. ID NO. 1.

26. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the sequence defined as pHCAP1.

27. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the sequence defined as pHCAP3.

28. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the sequence defined as pHCAP4.

29. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the sequence defined as vHCAP1.

30. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the sequence defined as vHCAP3.

31. An isolated DNA sequence comprising a DNA sequence or variants thereof
encoding the sequence defined as vHCAP4.

32. A method of assessing compounds which augment or inhibit the activity of 
Hepatitis C virus NS3 protease comprising: 

a) a control target mammalian cell; 

b) a first target mammalian cell expressing the pHCAP1 polyprotein;

c) a second target mammalian cell expressing the pHCAP4 polyprotein;

d) a third target mammalian cell expressing the viral promoter only;

e) incubating said control, first, second, and third target mammalian cells 
for about 24 hours in a suitable growth medium in the presence and/or 
absence of pharmacologically effective concentrations of candidate 
compounds; 

f) measuring the amount of SEAP activity; and 

g) determining whether said candidate compounds augmented or 
inhibited hepatitis C NS3 protease by comparing the SEAP activity of 
said control, first, second, and third target mammalian cells. 

33. A method of assessing compounds which augment or inhibit the activity of 
Hepatitis C virus NS3 protease comprising: 

a) a control target mammalian cell; 

b) a first target mammalian cell expressing the vHCAP1 polyprotein;

c) a second target mammalian cell expressing the vHCAP4 polyprotein; 

d) a third target mammalian cell expressing the viral promoter only; 

e) incubating said control, first, second, and third target mammalian cells 
for about 24 hours in a suitable growth medium in the presence and/or 
absence of pharmacologically effective concentrations of candidate 
compounds; 

f) measuring the amount of SEAP activity; and 

g) determining whether said candidate compounds augmented or
inhibited hepatitis C NS3 protease by comparing the SEAP activity of
said control, first, second, and third target mammalian cells.

34. A method of assessing compounds which augment or inhibit the activity of
Hepatitis C virus NS3 protease cis-only cleavage comprising:

a) a control target mammalian cell; 

b) a first target mammalian cell expressing the pHCAP3 polyprotein; 

c) a second target mammalian cell expressing the pHCAP4 polyprotein; 

d) a third target mammalian cell expressing the viral promoter only; 

e) incubating said control, first, second, and third target mammalian cells 
for about 24 hours in a suitable growth medium in the presence and/or 
absence of pharmacologically effective concentrations of candidate 
compounds; 

f) measuring the amount of SEAP activity; and 

g) determining whether said candidate compounds augmented or 
inhibited hepatitis C NS3 protease by comparing the SEAP activity of 
said control, first, second, and third target mammalian cells. 

35. A process for constructing a reporter gene system useful in the assessment of 
compounds which augment or inhibit the activity of Hepatitis C virus NS3 protease 
comprising: 

a) providing a recombinant viral vector comprising a DNA molecule
encoding an RNA polymerase promoter compatible with said viral
vector and which is expressed upon infection of a target mammalian
cell;

b) providing a recombinant plasmid comprising a DNA molecule encoding
the HCV/SEAP reporter gene polyprotein which is expressed when
transfected into a target mammalian cell further comprising the steps of
cloning into a suitable vector the NS2-NS3-NS4A-NS4B'-NS5A/5B
cleavage site-SEAP polyprotein;

c) said target mammalian cell line being infected first with said 
recombinant viral vector then transfected with said recombinant 
plasmid such that the DNA molecule encoding the HCV/SEAP reporter 
gene is under transcriptional control of said promoter; and 

d) the target mammalian cell expressing said HCV/SEAP reporter gene 
polyprotein such that SEAP is secreted from said target mammalian 
cell. 

36. A process for constructing a reporter gene system useful in the assessment of 
compounds which augment or inhibit the activity of Hepatitis C virus NS3 protease 
comprising: 

a) providing a first recombinant viral vector comprising a DNA molecule 
encoding an RNA polymerase promoter compatible with said viral 
vector and which is expressed upon infection of a target mammalian 
cell; 

b) providing a second recombinant viral vector comprising a DNA 
molecule encoding the HCV/SEAP reporter gene polyprotein which is 
expressed when transfected into a target mammalian cell further 
comprising the steps of cloning into a suitable vector the NS2-NS3- 
NS4A-NS4B' -NS5A/5B cleavage site - SEAP polyprotein; 

c) said target mammalian cell line being infected first with said first 
recombinant viral vector then co-infected with said second recombinant 
plasmid such that the DNA molecule encoding the HCV/SEAP reporter 
gene is under control of said promoter; and 

d) the target mammalian cell expresses said HCV/SEAP reporter gene
polyprotein such that SEAP is secreted from said target mammalian
cell.

37. The isolated DNA sequence of claim 27 comprising a DNA sequence or 
variants thereof in SEQ. ID NO. 8. 

38. The isolated DNA sequence of claim 28 comprising a DNA sequence or 
variants thereof in SEQ. ID NO. 15. 

39. A composition comprising the pHCAP1 polyprotein as described in SEQ. ID
NO. 2.

40. A composition comprising the pHCAP3 polyprotein as described in SEQ. ID 
NO. 9. 

41. A composition comprising the pHCAP4 polyprotein as described in SEQ. ID
NO. 16.

[Figure: Vaccinia Virus NS3/SEAP System]

[Figure: DI/DR Assay Compound Summary]

SEQUENCE LISTING

<110> Potts, Karen E.
      Jackson, Roberta L.
      Patick, Amy K.

<120> REPORTER GENE SYSTEM FOR USE IN CELL-BASED ASSESSMENT
      OF INHIBITORS OF THE HEPATITIS C VIRUS PROTEASE

<130> 0125-0005A

<140>
<141>

<150> 09/129,611
<151> 1998-08-05

<160> 33

<170> PatentIn Ver. 2.0

<210> 1
<211> 13910
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: plasmid phcap 1

<220>
<221> CDS
<222> (497)..(772)

<220>
<221> CDS
<222> (1425)..(6500)

<220>
<221> CDS
<222> (8579)..(9034)

<220>
<221> CDS
<222> (10191)..(10445)

<220>
<221> CDS
<222> (11877)..(12734)

<220>
<221> misc_feature
<222> (1)..(774)
<223> Vaccinia Virus thymidine Kinase gene recombination
      site

<220>
<221> promoter
<222> (794)..(816)
<223> T7 promoter

<220>
<221> misc_feature
<222> (846)..(1424)
<223> EMC/Internal Ribosome Entry Site (IRES)

<220>
<221> misc_feature
<222> (1426)..(1437)
<223> MCS (Multiple Cloning Site)

<220>
<221> misc_feature
<222> (1446)..(2318)
<223> HCV E2/NS2 domain

<220>
<221> misc_feature
<222> (2319)..(4231)
<223> HCV NS3 Domain containing the serine protease and
      helicase enzymes

<220>
<221> misc_feature
<222> (4203)..(4260)
<223> HCV NS3-NS4A cleavage site

<220>
<221> misc_feature
<222> (4375)..(4424)
<223> HCV NS4A-4B cleavage site

<220>
<221> misc_feature
<222> (4233)..(4394)
<223> HCV NS4A domain

<220>
<221> misc_feature
<222> (4395)..(4919)
<223> HCV NS4B Domain

<220>
<221> misc_feature
<222> (4920)..(4991)
<223> HCV NS5A-NS5B cleavage site

<220>
<221> misc_feature
<222> (4992)..(6501)
<223> SEAP Protein

<220>
<221> misc_feature
<222> (7915)..(7945)
<223> MCS (Multiple Cloning Site)

<220>
<221> terminator
<222> (7938)..(8078)
<223> term T7

<220>
<221> promoter
<222> (8080)..(8365)
<223> Vaccinia virus promoter; early/late promoter

<220>
<221> misc_feature
<222> (8560)..(11317)
<223> E. coli gpt; for selection of recombinants

<220>
<221> misc_feature
<222> (11318)..(13909)
<223> remaining DNA from 3' end of Tropix pCMV/SEAP
      plasmid



<400> 1 
aagcttttgc gatcaataaa tggatcacaa ccagtatctc ttaacgatgt tcttcgcaga  60

tgatgattca ttttttaagt atttggctag tcaagatgat gaatcttcat tatctgatat  120

attgcaaatc actcaatatc tagactttct gttattatta ttgatccaat caaaaaataa  180

attagaagcc gtgggtcatt gttatgaatc tctttcagag gaatacagac aattgacaaa  240

attcacagac tttcaagatt ttaaaaaact gtttaacaag gtccctattg ttacagatgg  300

aagggtcaaa cttaataaag gatatttgtt cgactttgtg attagtttga tgcgattcaa  360

aaaagaatcc tctctagcta ccaccgcaat agatcctgtt agatacatag atcctcgtcg  420

caatatcgca ttttctaacg tgatggatat attaaagtcg aataaagtga acaataatta  480

attctttatt gtcatc atg aac ggc gga cat att cag ttg ata atc ggc ccc  532
                  Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro
                  1               5                   10

atg ttt tca ggt aaa agt aca gaa tta att aga cga gtt aga cgt tat  580
Met Phe Ser Gly Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr
        15                  20                  25

caa ata gct caa tat aaa tgc gtg act ata aaa tat tct aac gat aat  628
Gln Ile Ala Gln Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn
    30                  35                  40

aga tac gga acg gga cta tgg acg cat gat aag aat aat ttt gaa gca  676
Arg Tyr Gly Thr Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala
45                  50                  55                  60

ttg gaa gca act aaa cta tgt gat gtc ttg gaa tca att aca gat ttc  724
Leu Glu Ala Thr Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe
                65                  70                  75

tcc gtg ata ggt atc gat gaa gga cag ttc ttt cca gac att gtt gaa  772
Ser Val Ile Gly Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu
            80                  85                  90

ttgatctcga tcccgcgaaa ttaatacgac tcactatagg gagaccacaa cggtttccct  832

ctagcgggat caattccgcc cctctccctc ccccccccct aacgttactg gccgaagccg  892

cttggaataa ggccggtgtg cgtttgtcta tatgttartt tccaccatat tgccgtcttt  952

tggcaatgtg agggcccgga aacctggccc tgtcttcttg acgagcattc ctaggggtct  1012

ttcccctctc gccaaaggaa tgcaaggtct gttgaatgtc gtgaaggaag cagttcctct  1072

ggaagcttct tgaagacaaa caacgtctgt agcgaccctt tgcaggcagc ggaacccccc  1132

acctggcgac aggtgcctct gcggccaaaa gccacgtgta taagatacac ctgcaaaggc  1192

ggcacaaccc cagtgccacg ttgtgagttg gatagttgtg gaaagagtca aatggctctc  1252

ctcaagcgta ttcaacaagg ggctgaagga tgcccagaag gtaccccatt gtatgggatc  1312

tgatctgggg cctcggtgca catgctttac atgtgtttag tcgaggttaa aaaacgtcta  1372

ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgataata cc atg gga  1430
                                                          Met Gly

att ccc caa ttc atg gca cgt gtc tgt gcc tgc ttg tgg atg atg ctg 1478 
He Pro Gin Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu 
95 100 105 110 

ctg ata gcc cag gcc gag gcc gcc ttg gag aac ctg gtg gtc etc aat 1526 
Leu He Ala Gin Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn 
115 120 125 

gcg gcg tct gtg gcc ggc gca cat ggc ate etc tec ttc ctt gtg ttc 1574 
Ala Ala Ser Val Ala Gly Ala His Gly He Leu Ser Phe Leu Val Phe 
130 135 140 

ttc tgt gcc gcc tgg tac ate aaa ggc agg ctg gtc cct ggg gcg gca 1622 
Phe Cys Ala Ala Trp Tyr He Lys Gly Arg Leu Val Pro Gly Ala Ala 
145 ' 150 155 

tat get ctt tat ggc gtg tgg ccg ctg etc ctg etc ttg ctg gca tta 1670 
Tyr Ala Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu 
160 165 170 

cca ccg cga get tac gcc atg gac egg gag atg get gca teg tgc gga 1718 
Pro Pro Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser Cys Gly 
175 180 185 190 

ggc gcg gtt ttt gtg ggt ctg gta etc ctg act ttg tea cca tac tac 1766 
Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro Tyr Tyr 
195 200 205 

aag gtg ttc etc get agg etc ata tgg tag tta caa tat ttt ace ace 1814 
Lys Val Phe Leu Ala Arg Leu He Trp Trp Leu Gin Tyr Phe Thr Thr 
210 215 220 

aga gcc gag gcg cac tta cat gtg tgg ate ccc ccc etc aac get egg 18 62 
Arg Ala Glu Ala His Leu His Val Trp He Pro Pro Leu Asn Ala Arg 
225 230 235 

gga ggc cgc gat gcc ate ate etc etc atg tgc gca gtc cat cca gag 1910 
Gly Gly Arg Asp Ala He He Leu Leu Met Cys Ala Val His Pro Glu 
240 245 250 

eta ate ttt gac ate ace aaa ctt eta att gcc ata etc ggt ccg etc 1958 
Leu He Phe Asp He Thr Lys Leu Leu He Ala He Leu Gly Pro Leu 
255 260 ~ 265 270 
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atg gtg etc caa get ggc ata acc aga gtg ccg tac ttc gtg cgc get 2006 
Met Val Leu Gin Ala Gly lie Thr Arg Val Pro Tyr Phe Val Arg Ala 
275 280 285 

caa ggg etc att cat gca tgc atg tta gtg egg aag gtc get ggg ggt 2054 
Gin Gly Leu lie His Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly 
290 295 300 

cat tat gtc caa atg gee ttc atg aag ctg ggc gcg ctg aca ggc acg 2102 
His Tyr Val Gin Met Ala Phe Met Lys Leu Gly Ala Leu Thr Gly Thr 
305 310 " 315 

tac att tac aac cat ctt acc ccg eta egg gat tgg gee cac gcg ggc 2150 
Tyr lie Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly 
320 325 330 

eta cga gac ctt gcg gtg gca gtg gag ccc gtc gtc ttc tec gac atg 2198 
Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser Asp Met 
335 " 340 345 350 

gag acc aag atc atc acc tgg gga gca gac acc gcg gcg tgt ggg gac 2246
Glu Thr Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp
355 360 365

ate ate ttg ggt ctg ccc gtc tec gee cga agg gga aag gag ata etc 2294 
lie lie Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu lie Leu 
370 375 380 

ctg ggc ccg gee gat agt ctt gaa ggg egg ggg tgg cga etc etc gcg 2342 
Leu Gly Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu Leu Ala 
385 390 395 

ccc ate acg gee tac tec caa cag acg egg ggc eta ctt ggt tgc ate 2390 
Pro lie Thr Ala Tyr Ser Gin Gin Thr Arg Gly Leu Leu Gly Cys lie 
400 405 410 

ate act age ctt aca ggc egg gac aag aac cag gtc gag gga gag gtt 2438 
lie Thr Ser Leu Thr Gly Arg Asp Lys Asn Gin Val Glu Gly Glu Val 
415 420 425 430 

cag gtg gtt tec acc gca aca caa tec ttc ctg gcg acc tgc gtc aac 2486 
Gin Val Val Ser Thr Ala Thr Gin Ser Phe Leu Ala Thr Cys Val Asn 
435 440 445 

ggc gtg tgt tgg acc gtt tac cat ggt get ggc tea aag acc tta gee 2534 
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala 
450 455 460 

ggc cca aag ggg cca atc acc cag atg tac act aat gtg gac cag gac 2582
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
465 470 475 

etc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tec ttg aca cca tgc 2630 
Leu Val Gly Trp Gin Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys 
480 485 " 490 

acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct gac gtc 2678
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
495 500 505 510

att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc tcc ccc 2726
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
515 520 525

agg cct gtc tcc tac ttg aag ggc tct tcg ggt ggt cca ctg ctc tgc 2774
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys
530 535 540

cct teg ggg cac get gtg ggc ate ttc egg get gee gta tgc ace egg 2822 
Pro Ser Gly His Ala Val Gly He Phe Arg Ala Ala Val Cys Thr Arg 
545 550 555 

ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tec atg gaa act 2870 
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr 
560 565 570 

act atg egg tct ccg gtc ttc acg gac aac tea tec ccc ccg gee gta 2918 
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val 
575 580 585 590 

ccg cag tea ttt caa gtg gee cac eta cac get ccc act ggc age ggc 2966 
Pro Gin Ser Phe Gin Val Ala His Leu His Ala Pro Thr Gly Ser Gly 
595 600 605 

aag agt act aaa gtg ccg get gca tat gca gee caa ggg tac aag gtg 3014 
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gin Gly Tyr Lys Val 
610 615 620 

etc gtc etc aat ccg tec gtt gee get ace tta ggg ttt ggg gcg tat 3062 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
625 630 635 

atg tct aag gca cac ggt att gac ccc aac ate aga act ggg gta agg 3110 
Met Ser Lys Ala His Gly He Asp Pro Asn He Arg Thr Gly Val Arg 
640 645 650 

ace att ace aca ggc gee ccc gtc aca tac tct ace tat ggc aag ttt 3158 
Thr He Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe 
655 660 665 670 

ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata ata tgt 3206
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
675 680 685

gat gag tgc cat tea act gac teg act aca ate ttg ggc ate ggc aca 3254 
Asp Glu Cys His Ser Thr Asp Ser Thr Thr lie Leu Gly He Gly Thr 
690 695 700 

gtc ctg gac caa gcg gag acg get gga gcg egg ctt gtc gtg etc gee 3302 
Val Leu Asp Gin Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
705 710 715 

ace get acg cct ccg gga teg gtc acc gtg cca cac cca aac ate gag 3350 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn lie Glu 
720 725 730 

gag gtg gee ctg tct aat act gga gag ate ccc ttc tat ggc aaa gee 3398 
Glu Val Ala Leu Ser Asn Thr Gly Glu He Pro Phe Tyr Gly Lys Ala 
735 740 745 750 






atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc tgt cat 3446
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
755 760 765 

tec aag aag aag tgc gac gag etc gee gca aag ctg tea ggc etc gga 3494 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
770 775 780 

ate aac get gtg gcg tat tac egg ggg etc gat gtg tec gtc ata cca 3542 
lie Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val lie Pro 
785 790 795 

act ate gga gac gtc gtt gtc gtg gca aca gac get ctg atg acg ggc 3590 
Thr lie Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
800 805 810 

tat acg ggc gac ttt gac tea gtg ate gac tgt aac aca tgt gtc ace 3638 
Tyr Thr Gly Asp Phe Asp Ser Val lie Asp Cys Asn Thr Cys Val Thr 
815 820 825 830 

cag aca gtc gac ttc age ttg gat ccc ace ttc ace att gag acg acg 3686 
Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr lie Glu Thr Thr 
835 840 845 

ace gtg cct caa gac gca gtg teg cgc teg cag egg egg ggt agg act 3734 
Thr Val Pro Gin Asp Ala Val Ser Arg Ser Gin Arg Arg Gly Arg Thr 
850 855 860 

ggc agg ggt agg aga ggc ate tac agg ttt gtg act ccg gga gaa egg 3782 
Gly Arg Gly Arg Arg Gly lie Tyr Arg Phe Val Thr Pro Gly Glu Arg 
865 870 875 

ccc teg ggc atg ttc gat tec teg gtc ctg tgt gag tgc tat gac gcg 3830 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
880 ~ 885 890 

ggc tgt get tgg tac gag etc ace ccc gee gag ace teg gtt agg ttg 3878 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
895 900 905 910 

cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac cac ctg 3926
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
915 920 925

gag ttc tgg gag agt gtc ttc aca ggc etc ace cat ata gat gca cac 3974 
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His lie Asp Ala His 
930 935 940 

ttc ttg tec cag ace aag cag gca gga gac aac ttc ccc tac ctg gta 4022 
Phe Leu Ser Gin Thr Lys Gin Ala Gly Asp Asn Phe Pro Tyr Leu Val 
945 950 " 955 

gca tac caa gee acg gtg tgc gee agg get cag gee cca cct cca tea 4070 
Ala Tyr Gin Ala Thr Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser 
960 965 970 

tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg ctg cac 4118
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
975 980 985 990

ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat gag gtc 4166
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
995 1000 1005 

acc etc acc cac ccc ata acc aaa tac ate atg gca tgc atg teg get 4214 
Thr Leu Thr His Pro lie Thr Lys Tyr lie Met Ala Cys Met Ser Ala 
1010 1015 1020 

gac ctg gag gtc gtc act age acc tgg gtg ctg gtg ggc gga gtc ctt 4262 
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu 
1025 1030 1035 

gca get ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc att gtg 4310 
Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val lie Val 
1040 1045 1050 

ggt agg att ate ttg tec ggg agg ccg gcc att gtt ccc gac agg gag 4358 
Gly Arg lie lie Leu Ser Gly Arg Pro Ala lie Val Pro Asp Arg Glu 
1055 1060 1065 1070 

ctt etc tac cag gag ttc gat gaa atg gaa gag tgc gcc teg cac etc 4406 
Leu Leu Tyr Gin Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu 
1075 1080 1085 

cct tac ate gag cag gga atg cag etc gcc gag caa ttc aag cag aaa 4454 
Pro Tyr lie Glu Gin Gly Met Gin Leu Ala Glu Gin Phe Lys Gin Lys 
1090 1095 1100 

gcg etc ggg tta ctg caa aca gcc acc aaa caa gcg gag get get get 4502 
Ala Leu Gly Leu Leu Gin Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala 
1105 1110 1115 

ccc gtg gtg gag tec aag tgg cga gcc ctt gag aca ttc tgg gcg aag 4550 
Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys 
1120 1125 1130 



cac atg tgg aat ttc atc agc ggg ata cag tac tta gca ggc tta tcc 4598
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
1135 1140 1145 1150

act ctg cct ggg aac ccc gca ata gca tca ttg atg gca ttc aca gcc 4646
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
1155 1160 1165

tct ate acc age ccg etc acc acc caa agt acc etc ctg ttt aac ate 4694 
Ser lie Thr Ser Pro Leu Thr Thr Gin Ser Thr Leu Leu Phe Asn lie 
1170 1175 1180 

ttg ggg ggg tgg gtg get gcc caa etc gcc ccc ccc age gcc get teg 4742 
Leu Gly Gly Trp Val Ala Ala Gin Leu Ala Pro Pro Ser Ala Ala Ser 
1185 1190 1195 

get ttc gtg ggc gcc ggc ate gcc ggt gcg get gtt ggc age ata ggc 4790 
Ala Phe Val Gly Ala Gly He Ala Gly Ala Ala Val Gly Ser He Gly 
1200 1205 1210 

ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca gga gtg 4838
Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
1215 1220 1225 1230

gcc ggc gcg ctc gtg gcc ttt aag gtc atg agc ggc gag atg ccc tcc 4886
Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser
1235 1240 1245

acc gag gac ctg gtc aat cta ctt cct gcc atc ctc gag gaa gct agt 4934
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu Ala Ser
1250 1255 1260

gag gat gtc gtc tgc tgc tca atg tcc tac aca tgg aca ggc gcc ttg 4982
Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu
1265 1270 1275

gag ctg ctg ctg ctg ctg ctg ctg ggc ctg agg cta cag ctc tcc ctg 5030
Glu Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu
1280 1285 1290

ggc atc atc cca gtt gag gag gag aac ccg gac ttc tgg aac cgc gag 5078
Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu
1295 1300 1305 1310

gca gcc gag gcc ctg ggt gcc gcc aag aag ctg cag cct gca cag aca 5126
Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr
1315 1320 1325

gcc gcc aag aac ctc atc atc ttc ctg ggc gat ggg atg ggg gtg tct 5174
Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser
1330 1335 1340

acg gtg aca gct gcc agg atc cta aaa ggg cag aag aag gac aaa ctg 5222
Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu
1345 1350 1355

ggg cct gag ata ccc ctg gcc atg gac cgc ttc cca tat gtg gct ctg 5270
Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu
1360 1365 1370

tcc aag aca tac aat gta gac aaa cat gtg cca gac agt gga gcc aca 5318
Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr
1375 1380 1385 1390

gcc acg gcc tac ctg tgc ggg gtc aag ggc aac ttc cag acc att ggc 5366
Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly
1395 1400 1405

ttg agt gca gcc gcc cgc ttt aac cag tgc aac acg aca cgc ggc aac 5414
Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn
1410 1415 1420

gag gtc atc tcc gtg atg aat cgg gcc aag aaa gca ggg aag tca gtg 5462
Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val
1425 1430 1435

gga gtg gta acc acc aca cga gtg cag cac gcc tcg cca gcc ggc acc 5510
Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr
1440 1445 1450

tac gcc cac acg gtg aac cgc aac tgg tac tcg gac gcc gac gtg cct 5558
Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro
1455 1460 1465 1470

gcc tcg gcc cgc cag gag ggg tgc cag gac atc gct acg cag ctc atc 5606
Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile
1475 1480 1485

tcc aac atg gac att gac gtg atc cta ggt gga ggc cga aag tac atg 5654
Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met
1490 1495 1500

ttt ccc atg gga acc cca gac cct gag tac cca gat gac tac age caa 5702 
Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gin 
1505 1510 1515 

ggt ggg acc agg ctg gac ggg aag aat ctg gtg cag gaa tgg ctg gcg 5750 
Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gin Glu Trp Leu Ala 
1520 1525 1530 

aag cgc cag ggt gcc egg tat gtg tgg aac cgc act gag ctg atg cag 5798 
Lys Arg Gin Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gin 
1535 1540 1545 1550 

gct tcc ctg gac ccg tct gtg acc cat ctc atg ggt ctc ttt gag cct 5846
Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro
1555 1560 1565

gga gap atg aaa tac gag ate cac cga gac tec aca ctg gac ccc tec 5894 
Gly Asp Met Lys Tyr Glu He His Arg Asp Ser Thr Leu Asp Pro Ser 
1570 1575 1580 

ctg atg gag atg aca gag get gcc ctg cgc ctg ctg age agg aac ccc 5942 
Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro 
1585 1590 * 1595 

cgc ggc ttc ttc etc ttc gtg gag ggt ggt cgc ate gac cat ggt cat 5990 
Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg He Asp His Gly His 
1600 1605 1610 

cat gaa agc agg gct tac cgg gca ctg act gag acg atc atg ttc gac 6038
His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp
1615 1620 1625 1630

gac gcc att gag agg gcg ggc cag etc acc age gag gag gac acg ctg 6086 
Asp Ala He Glu Arg Ala Gly Gin Leu Thr Ser Glu Glu Asp Thr Leu 
1635 1640 1645 

age etc gtc act gcc gac cac tec cac gtc ttc tec ttc gga ggc tac 6134 
Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr 
1650 1655 1660 

ccc ctg cga ggg age tgc ate ttc ggg ctg gcc cct ggc aag gcc egg 6182 
Pro Leu Arg Gly Ser Cys He Phe Gly Leu Ala Pro Gly Lys Ala Arg 
1665 1670 1675 

gac agg aag gcc tac acg gtc etc eta tac gga aac ggt cca ggc tat 6230 
Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr 
1680 1685 ~ 1690 

gtg ctc aag gac ggc gcc cgg ccg gat gtt acc gag agc gag agc ggg 6278
Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly
1695 1700 1705 1710

agc ccc gag tat cgg cag cag tca gca gtg ccc ctg gac gaa gag acc 6326
Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr
1715 1720 1725 

cac gca ggc gag gac gtg gcg gtg ttc gcg cgc ggc ccg cag gcg cac 6374 
His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gin Ala His 
1730 1735 1740 

ctg gtt cac ggc gtg cag gag cag ace ttc ata gcg cac gtc atg gec 6422 
Leu Val His Gly Val Gin Glu Gin Thr Phe He Ala His Val Met Ala 
1745 1750 1755 

ttc gee gec tgc ctg gag ccc tac acc gec tgc gac ctg gcg ccc ccc 6470 
Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro 
1760 1765 1770 

gee ggc acc acc gac gec gcg cac ccg ggt taacccgtgg tccccgcgtt 6520 
Ala Gly Thr Thr Asp Ala Ala His Pro Gly 
1775 1780 

gcttcctctg ctggccggga catcaggtgg cccccgctga attggaatcg atattgttac 6580 
aacaccccaa catcttcgac gcgggcgtgg caggtcttcc cgacgatgac gccggtgaac 6640
ttcccgccgc cgttgttgtt ttggagcacg gaaagacgat gacggaaaaa gagatcgtgg 6700
attacgtcgc cagtcaagta acaaccgcga aaaagttgcg cggaggagtt gtgtttgtgg 6760
acgaagtacc gaaaggtctt accggaaaac tcgacgcaag aaaaatcaga gagatcctca 6820
taaaggccaa gaagggcgga aagtccaaat tgtaaaatgt aactgtattc agcgatgacg 6880
aaattcttag ctattgtaat actgcgatga gtggcagggc ggggcgtaat ttttttaagg 6940
cagttattgg tgcccttaaa cgcctggtgc tacgcctgaa taagtgataa taagcggatg 7000
aatggcagaa attcgccgga tctttgtgaa ggaaccttac ttctgtggtg tgacataatt 7060
ggacaaacta cctacagaga tttaaagctc taaggtaaat ataaaatttt taagtgtata 7120
atgtgttaaa ctactgattc taattgtttg tgtattttag attccaacct atggaactga 7180
tgaatgggag cagtggtgga atgcctttaa tgaggaaaac ctgttttgct cagaagaaat 7240
gccatctagt gatgatgagg ctactgctga ctctcaacat tctactcctc caaaaaagaa 7300
gagaaaggta gaagacccca aggactttcc ttcagaattg ctaagttttt tgagtcatgc 7360
tgtgtttagt aatagaactc ttgcttgctt tgctatttac accacaaagg aaaaagctgc 7420
actgctatac aagaaaarta tggaaaaata ttctgtaacc tttataagta ggcataacag 7480
ttataatcat aacatactgt tttttcttac tccacacagg catagagtgt ctgctattaa 7540
taactatgct caaaaattgt gtacctttag ctttttaatt tgtaaagggg ttaataagga 7600
atatttgatg tatagtgcct tgactagaga tcataatcag ccataccaca tttgtagagg 7660
ttttacttgc tttaaaaaac ctcccacacc tccccctgaa cctgaaacat aaaatgaatg 7720
caattgttgt tgttaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 7780
tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 7840
tcatcaatgt atcttatcat gtctggatcc tctagagtcg acctgcaggc atgcaagctt 7900
ctcgagagta cttctagtgg atccctgcag ctcgagaggc ctaattaatt aagtcgacga 7960
tccggctgct aacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata 8020
actagcataa ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg 8080
aactatatcc ggagttaact cgacatatac tatatagtaa taccaatact caagactacg 8140
aaactgatac aatctcttat catgtgggta argttctcga tgtcgaatag ccatatgccg 8200
gtagttgcga tatacataaa ctgatcacta attccaaacc cacccgcttt ttatagtaag 8260
tttttcaccc ataaataata aatacaataa ttaatttctc gtaaaagtag aaaatatatt 8320
ctaatttatt gcacggtaag gaagtagaat cataaagaac agtgacggat cgatccccca 8380
agcttggaca caagacaggc ttgcgagata tgtttgagaa taccacttta tcccgcgtca 8440
gggagaggca gtgcgtaaaa agacgcggac tcatgtgaaa tactggtttt tagtgcgcca 8500
gatctctata atctcgcgca acctattttc ccctcgaaca ctttttaagc cgtagataaa 8560
caggctggga cacttcac atg agc gaa aaa tac atc gtc acc tgg gac atg 8611
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met
1785 1790 1795

ttg cag atc cat gca cgt aaa ctc gca agc cga ctg atg cct tct gaa 8659
Leu Gln Ile His Ala Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu
1800 1805 1810

caa tgg aaa ggc att att gee gta age cgt ggc ggt ctg gta ccg ggt 8707 
Gin Trp Lys Gly lie He Ala Val Ser Arg Gly Gly Leu Val Pro Gly 
1815 1820 1825 

gcg tta ctg gcg cgt gaa ctg ggt att cgt cat gtc gat ace gtt tgt 8755 
Ala Leu Leu Ala Arg Glu Leu Gly He Arg His Val Asp Thr Val Cys 
1830 1835 1840 

att tec age tac gat cac gac aac cag cgc gag ctt aaa gtg ctg aaa 8803 
He Ser Ser Tyr Asp His Asp Asn Gin Arg Glu Leu Lys Val Leu Lys 
1845 1850 1855 

cgc gca gaa ggc gat ggc gaa ggc ttc ate gtt att gat gac ctg gtg 8851 
Arg Ala Glu Gly Asp Gly Glu Gly Phe He Val He Asp Asp Leu Val 
1860 1865 1870 1875 

gat acc ggt ggt act gcg gtt gcg att cgt gaa atg tat cca aaa gcg 8899 
Asp Thr Gly Gly Thr Ala Val Ala He Arg Glu Met Tyr Pro Lys Ala 
1880 1885 1890 

cac ttt gtc acc atc ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat 8947
His Phe Val Thr Ile Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp
1895 1900 1905

gac tat gtt gtt gat atc ccg caa gat acc tgg att gaa cag ccg tgg 8995
Asp Tyr Val Val Asp Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp
1910 1915 1920 

gat atg ggc gtc gta ttc gtc ccg cca ate tec ggt cgc taatcttttc 9044 
Asp Met Gly Val Val Phe Val Pro Pro He Ser Gly Arg 
1925 1930 1935 

aacgcctggc actgccgggc gttgttcttt ttaacttcag gcgggttaca atagtttcca 9104 

gtaagtattc tggaggctgc atccatgaca caggcaaacc tgagcgaaac cctgttcaaa 9164

ccccgcttta aacatcctga aacctcgacg ctagtccgcc gctttaatca cggcgcacaa 9224 

ccgcctgtgc agtcggccct tgatggtaaa accatccctc actggtatcg catgattaac 9284 

cgtctgatgt ggatctggcg cggcattgac ccacgcgaaa tcctcgacgt ccaggcacgt 9344

attgtgatga gcgatgccga acgtaccgac gatgatttat acgatacggt gattggctac 9404

cgtggcggca actggattta tgagtgggcc ccggatcttt gtgaaggaac cttacttctg 9464 

tggtgtgaca taattggaca aactacctac agagatttaa agctctaagg taaatataaa 9524 

atttttaagt gtataatgtg ttaaactact gattctaatt gtttgtgtat tttagattcc 9584 

aacctatgga actgatgaat gggagcagtg gtggaatgcc tttaatgagg aaaacctgtt 9644 

ttgctcagaa gaaatgccat ctagtgatga tgaggctact gctgactctc aacattctac 9704

tcctccaaaa aagaagagaa aggtagaaga ccccaaggac tttccttcag aattgctaag 9764 

ttttttgagt catgctgtgt ttagtaatag aactcttgct tgctttgcta tttacaccac 9824

aaaggaaaaa gctgcactgc tatacaagaa aattatggaa aaatattctg taacctttat 9884 

aagtaggcat aacagttata atcataacat actgtttttt cttactccac acaggcatag 9944 

agtgtctgct attaataact atgctcaaaa attgtgtacc tttagctttt taatttgtaa 10004 

aggggttaat aaggaatatt tgatgtatag tgccttgact agagatcata atcagccata 10064

ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga 10124

aacataaaat gaatgcaatt gttgttgtta agcttggggg aattgcatgc tccggatcga 10184

gatcaa ttc tgt gag cgt atg gca aac gaa gga aaa ata gtt ata gta 10232 
Phe Cys Glu Arg Met Ala Asn Glu Gly Lys He Val He Val 
1940 1945 1950 

gec gca etc gat ggg aca ttt caa cgt aaa ccg ttt aat aat att ttg 10280 
Ala Ala Leu Asp Gly Thr Phe Gin Arg Lys Pro Phe Asn Asn He Leu 
1955 1960 1965 

aat ctt att cca tta tct gaa atg gtg gta aaa eta act get gtg tgt 10328 
Asn Leu lie Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys 
1970 1975 1980 

atg aaa tgc ttt aag gag gct tcc ttt tct aaa cga ttg ggt gag gaa 10376
Met Lys Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu
1985 1990 1995

acc gag ata gaa ata ata gga ggt aat gat atg tat caa tcg gtg tgt 10424
Thr Glu Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys
2000 2005 2010

aga aag tgt tac atc gac tca taatattata ttttttatct aaaaaactaa 10475
Arg Lys Cys Tyr Ile Asp Ser
2015 2020

aaataaacat tgattaaatt ttaatataat acttaaaaat ggatgttgtg tcgttagata 10535

aaccgtttat gtattttgag gaaattgata atgagttaga ttacgaacca gaaagtgcaa 10595

atgaggtcgc aaaaaaactg ccgtatcaag gacagttaaa actattacta ggagaattat 10655

tttttcttag taagttacag cgacacggta tattagatgg tgccaccgta gtgtatatag 10715

gatctgctcc cggtacacat atacgttatt tgagagatca tttctataat ttaggagtga 10775

tcatcaaatg gatgctaatt gacggccgcc atcatgatcc tattttaaat ggattgcgtg 10835

atgtgactct agtgactcgg ttcgttgatg aggaatatct acgatccatc aaaaaacaac 10895

tgcatccttc taagattatt ttaatttctg atgtgagatc caaacgagga ggaaatgaac 10955

ctagtacggc ggatttacta agtaattacg ctctacaaaa tgtcatgatt agtattttaa 11015

accccgtggc gtctagtctt aaatggagat gcccgtttcc agatcaatgg atcaaggact 11075

tttatatccc acacggtaat aaaatgttac aaccttttgc tccttcatat tcagggccgt 11135

cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc caacttaatc gccttgcagc 11195

acatccccct ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca 11255

acagttgcgc agcctgaatg gcgaatggcg cgacgcgccc tgtagcggcg cattaagcgc 11315

ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc tagcgcccgc 11375

tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc gtcaagctct 11435

aaatcggggg ctccctttag ggttccgatt tagtgcttta cggcacctcg accccaaaaa 11495

acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg tttttcgccc 11555

tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg gaacaacact 11615

caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt cggcctattg 11675

gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa tattaacgtt 11735

tacaatttcc caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt 11795 

ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 11855 

taatattgaa aaaggaagag t atg agt att caa cat ttc cgt gtc gcc ctt 11906
Met Ser Ile Gln His Phe Arg Val Ala Leu
2025 2030

att ccc ttt ttt gcg gca ttt tgc ctt cct gtt ttt gct cac cca gaa 11954
Ile Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu
2035 2040 2045 

acg ctg gtg aaa gta aaa gat get gaa gat cag ttg ggt gca cga gtg 12002 
Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gin Leu Gly Ala Arg Val 
2050 2055 2060 

ggt tac ate gaa ctg gat etc aac age ggt aag ate ctt gag agt ttt 12050 
Gly Tyr lie Glu Leu Asp Leu Asn Ser Gly Lys lie Leu Glu Ser Phe 
2065 2070 2075 

cgc ccc gaa gaa cgt ttt cca atg atg age act ttt aaa gtt ctg eta 12098 
Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu 
2080 2085 2090 2095 

tgt ggc gcg gta tta tec cgt att gac gee ggg caa gag caa etc ggt 12146 
Cys Gly Ala Val Leu Ser Arg lie Asp Ala Gly Gin Glu Gin Leu Gly 
2100 2105 2110 

cgc cgc ata cac tat tct cag aat gac ttg gtt gag tac tea cca gtc 12194 
Arg Arg lie His Tyr Ser Gin Asn Asp Leu Val Glu Tyr Ser Pro Val 
2115 2120 2125 

aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa tta tgc agt 12242 
Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser 
2130 2135 2140 

get gec ata acc atg agt gat aac act gcg gee aac tta ctt ctg aca 12290 
Ala Ala lie Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr 
2145 2150 2155 

acg atc gga gga ccg aag gag cta acc gct ttt ttg cac aac atg ggg 12338
Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly
2160 2165 2170 2175

gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc 12386
Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala
2180 2185 2190

ata cca aac gac gag cgt gac acc acg atg cct gta gca atg gca aca 12434
Ile Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr
2195 2200 2205

acg ttg cgc aaa eta tta act ggc gaa eta ctt act eta get tec egg 12482 
Thr Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg 
2210 2215 2220 

caa caa tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt 12530 
Gin Gin Leu lie Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu 
2225 2230 2235 

ctg cgc teg gee ctt ccg get ggc tgg ttt att get gat aaa tct gga 12578 
Leu Arg Ser Ala Leu Pro Ala Gly Trp Phe lie Ala Asp Lys Ser Gly 
2240 " 2245 2250 2255 

gcc ggt gag cgt ggg tct cgc ggt atc att gca gca ctg ggg cca gat 12626
Ala Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp
2260 2265 2270

ggt aag ccc tcc cgt atc gta gtt atc tac acg acg ggg agt cag gca 12674
Gly Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala
2275 2280 2285

act atg gat gaa cga aat aga cag atc gct gag ata ggt gcc tca ctg 12722
Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu
2290 2295 2300

att aag cat tgg taactgtcag accaagttta ctcatatata ctttagattg 12774
Ile Lys His Trp
2305



atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 12834 
tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 12894
tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 12954
aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 13014
aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 13074
taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 13134
taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 13194
agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 13254
tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca 13314
cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 13374
agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 13434
gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 13494
aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 13554
tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag 13614
ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg 13674
aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct 13734
ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt 13794
agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg 13854
gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat tacgcc 13910

<210> 2 
<211> 2307 
<212> PRT 

<213> Artificial Sequence 
<400> 2 

Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro Met Phe Ser Gly
1 5 10 15

Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr Gln Ile Ala Gln
20 25 30

Tyr Lys Cys Val Thr lie Lys Tyr Ser Asn Asp Asn Arg Tyr Gly Thr 
35 40 45 

Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala Leu Glu Ala Thr 
50 55 60 

Lys Leu Cys Asp Val Leu Glu Ser lie Thr Asp Phe Ser Val lie Gly 
65 " 70 75 80 

lie Asp Glu Gly Gin Phe Phe Pro Asp lie Val Glu Met Gly lie Pro 
85 90 95 

Gin Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu Leu lie 
100 105 110 

Ala Gin Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala 
115 120 125 

Ser Val Ala Gly Ala His Gly lie Leu Ser Phe Leu Val Phe Phe Cys 
130 135 140 

Ala Ala Trp Tyr lie Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala 
145 " 150 155 160 

Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro 
165 170 175 

Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala 
180 185 190 

Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val 
195 200 205 

Phe Leu Ala Arg Leu lie Trp Trp Leu Gin Tyr Phe Thr Thr Arg Ala 
210 215 * 220 

Glu Ala His Leu His Val Trp lie Pro Pro Leu Asn Ala Arg Gly Gly 
225 230 235 240 

Arg Asp Ala lie lie Leu Leu Met Cys Ala Val His Pro Glu Leu lie 
245 250 255 

Phe Asp lie Thr Lys Leu Leu lie Ala lie Leu Gly Pro Leu Met Val 
260 265 270 

Leu Gin Ala Gly lie Thr Arg Val Pro Tyr Phe Val Arg Ala Gin Gly 
275 280 285 

Leu lie His Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr 
290 295 300 

Val Gin Met Ala Phe Met Lys Leu Gly Ala Leu Thr Gly Thr Tyr He 
305 310 315 320 

Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg 
325 330 335 
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Asp Leu Ala Val 
340 

Lys lie lie Thr 
355 

Leu Gly Leu Pro 
370 

Pro Ala Asp Ser 
385 

Thr Ala Tyr Ser 



Ser Leu Thr Gly 
420 

Val Ser Thr Ala 
435 

Cys Trp Thr Val 
4 50 

Lys Gly Pro lie 
465 

Gly Trp Gin Ala 



Gly Ser Ser Asp 
500 



Val Arg Arg Arg 
515 

Val Ser Tyr Leu 
530 

Gly His Ala Val 
545 

Ala Lys Ala Val 



Arg Ser Pro Val 
580 

Ser Phe Gin Val 
595 

Thr Lys Val Pro 
610 

Leu Asn • Pro Ser 
625 

Lys Ala His Gly 



Ala Val Glu Pro 



Trp Gly Ala Asp 
360 

Val Ser Ala Arg 
375 

Leu Glu Gly Arg 
390 

Gin Gin Thr Arg 
405 

Arg Asp Lys Asn 



Thr Gin Ser Phe 
440 

Tyr His Gly Ala 
455 

Thr Gin Met Tyr 
470 

Pro Pro Gly Ala 
485 

Leu Tyr Leu Val 



Gly Asp Ser Arg 
520 

Lys Gly Ser Ser 
535 

Gly lie Phe Arg 
550 

Asp Phe Val Pro 
565 

Phe Thr Asp Asn 



Ala His Leu His 
600 

Ala Ala Tyr Ala 
615 

Val Ala Ala Thr 
630 

lie Asp Pro Asn 
645 



Val Val Phe Ser 
345 

Thr Ala Ala Cys 



Arg Gly Lys Glu 
380 

Gly Trp Arg Leu 
395 

Gly Leu Leu Gly 
410 

Gin Val Glu Gly 

425 . 

Leu Ala Thr Cys 



Gly Ser Lys Thr 
460 

Thr Asn Val Asp 
475 

Arg Ser Leu Thr 
490 

Thr Arg His Ala 
505 

Gly Ser Leu Leu 



Gly Gly Pro Leu 
540 

Ala Ala Val Cys 
555 

Val Glu Ser Met 
570 

Ser Ser Pro Pro 
585 

Ala Pro Thr Gly 



Ala Gin Gly Tyr 
620 

Leu Gly Phe Gly 
635 

lie Arg Thr Gly 
650 



Asp Met Glu Thr 
350 

Gly Asp He He 
365 

He Leu Leu Gly 



Leu Ala Pro He 
400 

Cys He He Thr 
415 

Glu Val Gin Val 
430 

Val Asn Gly Val 
445 

Leu Ala Gly Pro 



Gin Asp Leu Val 
480 

Pro Cys Thr Cys 
495 

Asp Val He Pro 
510 

Ser Pro Arg Pro 
525 

Leu Cys Pro Ser 



Thr Arg Gly Val 
560 

Glu Thr Thr Met 
575 

Ala Val Pro Gin 
590 

Ser Gly Lys Ser 
605 

Lys Val Leu Val 



Ala Tyr Met Ser 
640 

Val Arg Thr lie 
655 
Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
660 665 670

Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu
675 680 685

Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
690 695 700

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala
705 710 715 720

Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val
725 730 735

Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro
740 745 750

Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
755 760 765

Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn
770 775 780

Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile
785 790 795 800

Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr
805 810 815

Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr
820 825 830

Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val
835 840 845

Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg
850 855 860

Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser
865 870 875 880

Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys
885 890 895

Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala
900 905 910

Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe
915 920 925

Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu
930 935 940

Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr
945 950 955 960

Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp
965 970 975

Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro
980 985 990

Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu
995 1000 1005

Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu
1010 1015 1020

Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala
1025 1030 1035 1040

Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg
1045 1050 1055

Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg Glu Leu Leu
1060 1065 1070

Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr
1075 1080 1085

Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu
1090 1095 1100

Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val
1105 1110 1115 1120

Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys His Met
1125 1130 1135

Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1140 1145 1150

Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile
1155 1160 1165

Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn Ile Leu Gly
1170 1175 1180

Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe
1185 1190 1195 1200

Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly
1205 1210 1215

Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly
1220 1225 1230

Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu
1235 1240 1245

Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu Ala Ser Glu Asp
1250 1255 1260

Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Glu Leu
1265 1270 1275 1280

Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu Gly Ile
1285 1290 1295






Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu Ala Ala
1300 1305 1310

Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr Ala Ala
1315 1320 1325

Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser Thr Val
1330 1335 1340

Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu Gly Pro
1345 1350 1355 1360

Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu Ser Lys
1365 1370 1375

Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr Ala Thr
1380 1385 1390

Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly Leu Ser
1395 1400 1405

Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn Glu Val
1410 1415 1420

Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val Gly Val
1425 1430 1435 1440

Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr Tyr Ala
1445 1450 1455

His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro Ala Ser
1460 1465 1470

Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile Ser Asn
1475 1480 1485

Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met Phe Pro
1490 1495 1500

Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln Gly Gly
1505 1510 1515 1520

Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala Lys Arg
1525 1530 1535

Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln Ala Ser
1540 1545 1550

Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro Gly Asp
1555 1560 1565

Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser Leu Met
1570 1575 1580

Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro Arg Gly
1585 1590 1595 1600

Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His His Glu
1605 1610 1615

Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp Asp Ala
1620 1625 1630

Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu Ser Leu
1635 1640 1645

Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr Pro Leu
1650 1655 1660

Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg Asp Arg
1665 1670 1675 1680

Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr Val Leu
1685 1690 1695

Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly Ser Pro
1700 1705 1710

Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr His Ala
1715 1720 1725

Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His Leu Val
1730 1735 1740

His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala Phe Ala
1745 1750 1755 1760

Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro Ala Gly
1765 1770 1775

Thr Thr Asp Ala Ala His Pro Gly Met Ser Glu Lys Tyr Ile Val Thr
1780 1785 1790

Trp Asp Met Leu Gln Ile His Ala Arg Lys Leu Ala Ser Arg Leu Met
1795 1800 1805

Pro Ser Glu Gln Trp Lys Gly Ile Ile Ala Val Ser Arg Gly Gly Leu
1810 1815 1820

Val Pro Gly Ala Leu Leu Ala Arg Glu Leu Gly Ile Arg His Val Asp
1825 1830 1835 1840

Thr Val Cys Ile Ser Ser Tyr Asp His Asp Asn Gln Arg Glu Leu Lys
1845 1850 1855

Val Leu Lys Arg Ala Glu Gly Asp Gly Glu Gly Phe Ile Val Ile Asp
1860 1865 1870

Asp Leu Val Asp Thr Gly Gly Thr Ala Val Ala Ile Arg Glu Met Tyr
1875 1880 1885

Pro Lys Ala His Phe Val Thr Ile Phe Ala Lys Pro Ala Gly Arg Pro
1890 1895 1900

Leu Val Asp Asp Tyr Val Val Asp Ile Pro Gln Asp Thr Trp Ile Glu
1905 1910 1915 1920

Gln Pro Trp Asp Met Gly Val Val Phe Val Pro Pro Ile Ser Gly Arg
1925 1930 1935

Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val Ala Ala
1940 1945 1950

Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu Asn Leu
1955 1960 1965

Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys Met Lys
1970 1975 1980

Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu Thr Glu
1985 1990 1995 2000

Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys Arg Lys
2005 2010 2015

Cys Tyr Ile Asp Ser Met Ser Ile Gln His Phe Arg Val Ala Leu Ile
2020 2025 2030

Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr
2035 2040 2045

Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly
2050 2055 2060

Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg
2065 2070 2075 2080

Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys
2085 2090 2095

Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg
2100 2105 2110

Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr
2115 2120 2125

Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala
2130 2135 2140

Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr
2145 2150 2155 2160

Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp
2165 2170 2175

His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile
2180 2185 2190

Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr
2195 2200 2205

Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln
2210 2215 2220

Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu
2225 2230 2235 2240

Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala
2245 2250 2255

Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly
2260 2265 2270

Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr
2275 2280 2285

Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile
2290 2295 2300

Lys His Trp
2305



<210> 3 
<211> 92 
<212> PRT 

<213> Artificial Sequence 



<400> 3 

Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro Met Phe Ser Gly
1 5 10 15

Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr Gln Ile Ala Gln
20 25 30

Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn Arg Tyr Gly Thr
35 40 45

Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala Leu Glu Ala Thr
50 55 60

Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe Ser Val Ile Gly
65 70 75 80

Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu
85 90



<210> 4 
<211> 1692 
<212> PRT 

<213> Artificial Sequence 



<400> 4 

Met Gly Ile Pro Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met
1 5 10 15

Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val
20 25 30

Leu Asn Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu
35 40 45

Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly
50 55 60

Ala Ala Tyr Ala Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu
65 70 75 80

Ala Leu Pro Pro Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser
85 90 95

Cys Gly Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro
100 105 110

Tyr Tyr Lys Val Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe
115 120 125

Thr Thr Arg Ala Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn
130 135 140

Ala Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His
145 150 155 160

Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly
165 170 175

Pro Leu Met Val Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val
180 185 190

Arg Ala Gln Gly Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala
195 200 205

Gly Gly His Tyr Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr
210 215 220

Gly Thr Tyr Ile Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His
225 230 235 240

Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser
245 250 255

Asp Met Glu Thr Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Cys
260 265 270

Gly Asp Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu
275 280 285

Ile Leu Leu Gly Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu
290 295 300

Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
305 310 315 320

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
325 330 335

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
340 345 350

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
355 360 365

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
370 375 380

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
385 390 395 400

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
405 410 415

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
420 425 430

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
435 440 445

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
450 455 460

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
465 470 475 480

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
485 490 495

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
500 505 510

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
515 520 525

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
530 535 540

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
545 550 555 560

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
565 570 575

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
580 585 590

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
595 600 605

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
610 615 620

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
625 630 635 640

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
645 650 655

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
660 665 670

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
675 680 685

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
690 695 700

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
705 710 715 720

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
725 730 735

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
740 745 750

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
755 760 765

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
770 775 780

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
785 790 795 800

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
805 810 815

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
820 825 830

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
835 840 845

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
850 855 860

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
865 870 875 880

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
885 890 895

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
900 905 910

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
915 920 925

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
930 935 940

Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
945 950 955 960

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
965 970 975

Arg Glu Leu Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
980 985 990

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
995 1000 1005

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
1010 1015 1020

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
1025 1030 1035 1040

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
1045 1050 1055

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
1060 1065 1070

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
1075 1080 1085

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
1090 1095 1100

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
1105 1110 1115 1120

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
1125 1130 1135

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
1140 1145 1150

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu
1155 1160 1165

Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly
1170 1175 1180

Ala Leu Glu Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu
1185 1190 1195 1200

Ser Leu Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn
1205 1210 1215

Arg Glu Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala
1220 1225 1230

Gln Thr Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly
1235 1240 1245

Val Ser Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp
1250 1255 1260

Lys Leu Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val
1265 1270 1275 1280

Ala Leu Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly
1285 1290 1295

Ala Thr Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr
1300 1305 1310

Ile Gly Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg
1315 1320 1325

Gly Asn Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys
1330 1335 1340

Ser Val Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala
1345 1350 1355 1360

Gly Thr Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp
1365 1370 1375

Val Pro Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln
1380 1385 1390

Leu Ile Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys
1395 1400 1405

Tyr Met Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr
1410 1415 1420

Ser Gln Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp
1425 1430 1435 1440

Leu Ala Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu
1445 1450 1455

Met Gln Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe
1460 1465 1470

Glu Pro Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp
1475 1480 1485

Pro Ser Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg
1490 1495 1500

Asn Pro Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His
1505 1510 1515 1520

Gly His His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met
1525 1530 1535

Phe Asp Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp
1540 1545 1550

Thr Leu Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly
1555 1560 1565

Gly Tyr Pro Leu Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys
1570 1575 1580

Ala Arg Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro
1585 1590 1595 1600

Gly Tyr Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu
1605 1610 1615

Ser Gly Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu
1620 1625 1630

Glu Thr His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln
1635 1640 1645

Ala His Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val
1650 1655 1660

Met Ala Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala
1665 1670 1675 1680

Pro Pro Ala Gly Thr Thr Asp Ala Ala His Pro Gly
1685 1690






<210> 5 
<211> 152 
<212> PRT 

<213> Artificial Sequence 
<400> 5 

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150



<210> 6 
<211> 85 
<212> PRT 

<213> Artificial Sequence 
<400> 6 

Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val Ala Ala
1 5 10 15

Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu Asn Leu
20 25 30

Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys Met Lys
35 40 45

Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu Thr Glu
50 55 60

Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys Arg Lys
65 70 75 80

Cys Tyr Ile Asp Ser
85

<210> 7
<211> 286 
<212> PRT 

<213> Artificial Sequence 
<400> 7 

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1 5 10 15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
20 25 30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
35 40 45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
50 55 60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65 70 75 80

Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
85 90 95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
100 105 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
115 120 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
130 135 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145 150 155 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
165 170 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
180 185 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
195 200 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
210 215 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225 230 235 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
260 265 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp
275 280 285

<210> 8
<211> 13910
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: plasmid phcap 3

<220>
<221> CDS
<222> (497)..(772)

<220>
<221> CDS
<222> (1425)..(6500)

<220>
<221> CDS
<222> (8579)..(9034)

<220>
<221> CDS
<222> (10191)..(10445)

<220>
<221> CDS
<222> (11877)..(12734)

<220>
<221> misc_feature
<222> (1)..(774)
<223> Vaccinia virus thymidine kinase gene recombination site

<220>
<221> promoter
<222> (794)..(816)
<223> T7 promoter

<220>
<221> misc_feature
<222> (846)..(1424)
<223> EMC/Internal Ribosome Entry Site (IRES)

<220>
<221> misc_feature
<222> (1426)..(1437)
<223> MCS (Multiple Cloning Site)

<220>
<221> misc_feature
<222> (1446)..(2318)
<223> HCV E2/NS2 domain

<220>
<221> misc_feature
<222> (2319)..(4231)
<223> HCV NS3 Domain containing the serine protease and helicase enzymes

<220>
<221> misc_feature
<222> (4203)..(4260)
<223> HCV NS3-NS4A cleavage site

<220>
<221> misc_feature
<222> (4375)..(4424)
<223> HCV NS4A-4B cleavage site

<220>
<221> misc_feature
<222> (4233)..(4394)
<223> HCV NS4A domain

<220>
<221> misc_feature
<222> (4395)..(4919)
<223> HCV NS4B Domain

<220>
<221> misc_feature
<222> (4920)..(4991)
<223> HCV NS5A-NS5B cleavage site

<220>
<221> misc_feature
<222> (4992)..(6501)
<223> SEAP Protein

<220>
<221> misc_feature
<222> (7915)..(7945)
<223> MCS (Multiple Cloning Site)

<220>
<221> terminator
<222> (7938)..(8078)
<223> term T7

<220>
<221> promoter
<222> (8080)..(8365)
<223> Vaccinia virus promoter; early/late promoter

<220>
<221> misc_feature
<222> (8560)..(11317)
<223> E. coli gpt; for selection of recombinants

<220>
<221> misc_feature
<222> (11318)..(13909)
<223> remaining DNA from 3' end of Tropix pCMV/SEAP plasmid

<400> 8

aagcttttgc gatcaataaa tggatcacaa ccagtatctc ttaacgatgt tcttcgcaga 60 
tgatgattca ttttttaagt atttggctag tcaagatgat gaatcttcat tatctgatat 120 
attgcaaatc actcaatatc tagactttct gttattatta ttgatccaat caaaaaataa 180

attagaagcc gtgggtcatt gttatgaatc tctttcagag gaatacagac aattgacaaa 240

attcacagac tttcaagatt ttaaaaaact gtttaacaag gtccctattg ttacagatgg 300

aagggtcaaa cttaataaag gatatttgtt cgactttgtg attagtttga tgcgattcaa 360

aaaagaatcc tctctagcta ccaccgcaat agatcctgtt agatacatag atcctcgtcg 420

caatatcgca ttttctaacg tgatggatat attaaagtcg aataaagtga acaataatta 480

attctttatt gtcatc atg aac ggc gga cat att cag ttg ata atc ggc ccc 532
Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro
1 5 10

atg ttt tca ggt aaa agt aca gaa tta att aga cga gtt aga cgt tat 580
Met Phe Ser Gly Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr
15 20 25

caa ata gct caa tat aaa tgc gtg act ata aaa tat tct aac gat aat 628
Gln Ile Ala Gln Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn
30 35 40

aga tac gga acg gga cta tgg acg cat gat aag aat aat ttt gaa gca 676
Arg Tyr Gly Thr Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala
45 50 55 60

ttg gaa gca act aaa cta tgt gat gtc ttg gaa tca att aca gat ttc 724
Leu Glu Ala Thr Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe
65 70 75

tcc gtg ata ggt atc gat gaa gga cag ttc ttt cca gac att gtt gaa 772
Ser Val Ile Gly Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu
80 85 90

ttgatctcga tcccgcgaaa ttaatacgac tcactatagg gagaccacaa cggtttccct 832

ctagcgggat caattccgcc cctctccctc ccccccccct aacgttactg gccgaagccg 892

cttggaataa ggccggtgtg cgtttgtcta tatgttattt tccaccatat tgccgtcttt 952

tggcaatgtg agggcccgga aacctggccc tgtcttcttg acgagcattc ctaggggtct 1012

ttcccctctc gccaaaggaa tgcaaggtct gttgaatgtc gtgaaggaag cagttcctct 1072

ggaagcttct tgaagacaaa caacgtctgt agcgaccctt tgcaggcagc ggaacccccc 1132

acctggcgac aggtgcctct gcggccaaaa gccacgtgta taagatacac ctgcaaaggc 1192

ggcacaaccc cagtgccacg ttgtgagttg gatagttgtg gaaagagtca aatggctctc 1252

ctcaagcgta ttcaacaagg ggctgaagga tgcccagaag gtaccccatt gtatgggatc 1312

tgatctgggg cctcggtgca catgctttac atgtgtttag tcgaggttaa aaaacgtcta 1372

ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgataata cc atg gga 1430
Met Gly

att ccc caa ttc atg gca cgt gtc tgt gcc tgc ttg tgg atg atg ctg 1478
Ile Pro Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu
95 100 105 110

ctg ata gcc cag gcc gag gcc gcc ttg gag aac ctg gtg gtc ctc aat 1526
Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn
115 120 125

gcg gcg tct gtg gcc ggc gca cat ggc atc ctc tcc ttc ctt gtg ttc 1574
Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu Val Phe
130 135 140

ttc tgt gcc gcc tgg tac atc aaa ggc agg ctg gtc cct ggg gcg gca 1622
Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala
145 150 155

tat gct ctt tat ggc gtg tgg ccg ctg ctc ctg ctc ttg ctg gca tta 1670
Tyr Ala Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu
160 165 170

cca ccg cga gct tac gcc atg gac cgg gag atg gct gca tcg tgc gga 1718
Pro Pro Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser Cys Gly
175 180 185 190

ggc gcg gtt ttt gtg ggt ctg gta ctc ctg act ttg tca cca tac tac 1766
Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro Tyr Tyr
195 200 205

aag gtg ttc ctc gct agg ctc ata tgg tgg tta caa tat ttt acc acc 1814
Lys Val Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe Thr Thr
210 215 220

aga gcc gag gcg cac tta cat gtg tgg atc ccc ccc ctc aac gct cgg 1862
Arg Ala Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn Ala Arg
225 230 235

gga ggc cgc gat gcc atc atc ctc ctc atg tgc gca gtc cat cca gag 1910
Gly Gly Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His Pro Glu
240 245 250

cta atc ttt gac atc acc aaa ctt cta att gcc ata ctc ggt ccg ctc 1958
Leu Ile Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly Pro Leu
255 260 265 270

atg gtg ctc caa gct ggc ata acc aga gtg ccg tac ttc gtg cgc gct 2006
Met Val Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val Arg Ala
275 280 285

caa ggg ctc att cat gca tgc atg tta gtg cgg aag gtc gct ggg ggt 2054
Gln Gly Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly
290 295 300

cat tat gtc caa atg gcc ttc atg aag ctg ggc gcg ctg aca ggc acg 2102
His Tyr Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr Gly Thr
305 310 315

tac att tac aac cat ctt acc ccg cta cgg gat tgg gcc cac gcg ggc 2150
Tyr Ile Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly
320 325 330

cta cga gac ctt gcg gtg gca gtg gag ccc gtc gtc ttc tcc gac atg 2198
Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser Asp Met
335 340 345 350

gag acc aag atc atc acc tgg gga gca gac acc gcg gcg tgt ggg gac 2246
Glu Thr Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp
355 360 365

atc atc ttg ggt ctg ccc gtc tcc gcc cga agg gga aag gag ata ctc 2294
Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu Ile Leu
370 375 380

ctg ggc ccg gcc gat agt ctt gaa ggg cgg ggg tgg cga ctc ctc gcg 2342
Leu Gly Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu Leu Ala
385 390 395

ccc atc acg gcc tac tcc caa cag acg cgg ggc cta ctt ggt tgc atc 2390
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
400 405 410

atc act agc ctt aca ggc cgg gac aag aac cag gtc gag gga gag gtt 2438
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
415 420 425 430

cag gtg gtt tcc acc gca aca caa tcc ttc ctg gcg acc tgc gtc aac 2486
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
435 440 445

ggc gtg tgt tgg acc gtt tac cat ggt gct ggc tca aag acc tta gcc 2534
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
450 455 460

ggc cca aag ggg cca atc acc cag atg tac act aat gtg gac cag gac 2582
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
465 470 475

ctc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tcc ttg aca cca tgc 2630
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
480 485 490

acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct gac gtc 2678
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
495 500 505 510

att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc tcc ccc 2726
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
515 520 525

agg cct gtc tcc tac ttg aag ggc tct gcg ggt ggt cca ctg ctc tgc 2774
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys
530 535 540

cct tcg ggg cac gct gtg ggc atc ttc cgg gct gcc gta tgc acc cgg 2822
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
545 550 555

ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tcc atg gaa act 2870
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
560 565 570

act atg cgg tct ccg gtc ttc acg gac aac tca tcc ccc ccg gcc gta 2918
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
575 580 585 590

ccg cag tca ttt caa gtg gcc cac cta cac gct ccc act ggc agc ggc 2966
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
595 600 605

aag agt act aaa gtg ccg gct gca tat gca gcc caa ggg tac aag gtg 3014
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
610 615 620

ctc gtc ctc aat ccg tcc gtt gcc gct acc tta ggg ttt ggg gcg tat 3062
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
625 630 635

atg tct aag gca cac ggt att gac ccc aac atc aga act ggg gta agg 3110
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
640 645 650

acc att acc aca ggc gcc ccc gtc aca tac tct acc tat ggc aag ttt 3158
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
655 660 665 670

ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata ata tgt 3206
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
675 680 685

gat gag tgc cat tca act gac tcg act aca atc ttg ggc atc ggc aca 3254
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
690 695 700

gtc ctg gac caa gcg gag acg gct gga gcg cgg ctt gtc gtg ctc gcc 3302
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
705 710 715

acc gct acg cct ccg gga tcg gtc acc gtg cca cac cca aac atc gag 3350
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
720 725 730

gag gtg gcc ctg tct aat act gga gag atc ccc ttc tat ggc aaa gcc 3398
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
735 740 745 750

atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc tgt cat 3446
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
755 760 765

tcc aag aag aag tgc gac gag ctc gcc gca aag ctg tca ggc ctc gga 3494
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
770 775 780

atc aac gct gtg gcg tat tac cgg ggg ctc gat gtg tcc gtc ata cca 3542
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
785 790 795

act atc gga gac gtc gtt gtc gtg gca aca gac gct ctg atg acg ggc 3590
Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
800 805 810

tat acg ggc gac ttt gac tca gtg atc gac tgt aac aca tgt gtc acc 3638
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
815 820 825 830

cag aca gtc gac ttc agc ttg gat ccc acc ttc acc att gag acg acg 3686
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
835 840 845

acc gtg cct caa gac gca gtg tcg cgc tcg cag cgg cgg ggt agg act 3734
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
850 855 860

ggc agg ggt agg aga ggc atc tac agg ttt gtg act ccg gga gaa cgg 3782
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
865 870 875

ccc tcg ggc atg ttc gat tcc tcg gtc ctg tgt gag tgc tat gac gcg 3830
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
880 885 890

ggc tgt gct tgg tac gag ctc acc ccc gcc gag acc tcg gtt agg ttg 3878
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu
895 900 905 910

cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac cac ctg 3926
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
915 920 925

gag ttc tgg gag agt gtc ttc aca ggc ctc acc cat ata gat gca cac 3974
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
930 935 940

ttc ttg tcc cag acc aag cag gca gga gac aac ttc ccc tac ctg gta 4022
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
945 950 955

gca tac caa gcc acg gtg tgc gcc agg gct cag gcc cca cct cca tca 4070
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
960 965 970

tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg ctg cac 4118
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
975 980 985 990

ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat gag gtc 4166
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
995 1000 1005

acc ctc acc cac ccc ata acc aaa tac atc atg gca tgc atg tcg gct 4214
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
1010 1015 1020

gac ctg gag gtc gtc act agc acc tgg gtg ctg gtg ggc gga gtc ctt 4262
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
1025 1030 1035

gca gct ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc att gtg 4310
Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile Val
1040 1045 1050

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ttg ggg ggg tgg gtg get gec caa etc gec ccc ccc age gee get teg 
Leu Gly Gly Trp Val Ala Ala Gin Leu Ala Pro Pro Ser Ala Ala Ser 
1185 1190 1195 



ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca gga gtg 
Leu Gly Lys Val Leu Val Asp lie Leu Ala Gly Tyr Gly A>a Gly Val 
1215 1220 1225 1230 



gag gat gtc gtc tgc tgc tea atg tec tac aca tgg aca ggc gee ttg 
Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu 
1265 1270 1275 



4502 



ggt agg att ate ttg tec ggg agg ccg gee att gtt ccc gac agg gag 4358 
Gly Arg He He Leu Ser Gly Arg Pro Ala He Val Pro Asp Arg Glu 
1055 1060 1065 1070 

ctt etc tac cag gag ttc gat gaa atg gaa gag tgc gee teg cac etc 4406 
Leu Leu Tyr Gin Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu 
1075 1080 1085 

cct tac ate gag cag gga atg cag etc gee gag caa ttc aag cag aaa 4454 
Pro Tyr He Glu Gin Gly Met Gin Leu Ala Glu Gin Phe Lys Gin Lys 
1090 1095 HOO 

gcg etc ggg tta ctg caa aca gee ace aaa caa gcg gag get get get 
Ala Leu Gly Leu Leu Gin Thr Ala Thr Lys Gin Ala Glu Ala Ala Ala 
1105 1110 1115 

ccc gtg gtg gag tec aag tgg cga gec ctt gag aca ttc tgg gcg aag 4550 
Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys 
1120 H25 1130 

cac atg tgg aat ttc ate age ggg ata cag tac tta gca ggc tta tec 4598 
His Met Trp Asn Phe He Ser Gly He Gin Tyr Leu Ala Gly Leu Ser 
1135 " 1140 1145 1150 

act ctg cct ggg aac ccc gca ata gca tea ttg atg gca ttc aca gee 4646 
Thr Leu Pro Gly Asn Pro Ala He Ala Ser Leu Met Ala Phe Thr Ala 
1155 1160 1165 

tct ate ace age ccg etc acc acc caa agt ace etc ctg ttt aac ate 4694 
Ser He Thr Ser Pro Leu Thr Thr Gin Ser Thr Leu Leu Phe Asn He 
1170 1175 1180 



4742 



get ttc gtg ggc gee ggc ate gec ggt gcg get gtt ggc age ata ggc 47 90 
Ala Phe Val Gly Ala Gly He Ala Gly Ala Ala Val Gly Ser lie Gly 
1200 1205 1210 



4838 



gee ggc gcg etc gtg gee ttt aag gtc atg age ggc gag atg ccc tec 4886 
Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser 
1235 1240 1245 

acc gag gac ctg gtc aat eta ctt cct gee ate etc gag gaa get agt 4934 
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala He Leu Glu Glu Ala Ser 
1250 1255 1260 



4982 



gag ctg ctg ctg ctg ctg ctg ctg ggc ctg agg cta cag ctc tcc ctg 5030
Glu Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu
1280 1285 1290

ggc atc atc cca gtt gag gag gag aac ccg gac ttc tgg aac cgc gag 5078
Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu
1295 1300 1305 1310

gca gcc gag gcc ctg ggt gcc gcc aag aag ctg cag cct gca cag aca 5126
Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr
1315 1320 1325

gcc gcc aag aac ctc atc atc ttc ctg ggc gat ggg atg ggg gtg tct 5174
Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser
1330 1335 1340

acg gtg aca gct gcc agg atc cta aaa ggg cag aag aag gac aaa ctg 5222
Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu
1345 1350 1355

ggg cct gag ata ccc ctg gcc atg gac cgc ttc cca tat gtg gct ctg 5270
Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu
1360 1365 1370

tcc aag aca tac aat gta gac aaa cat gtg cca gac agt gga gcc aca 5318
Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr
1375 1380 1385 1390

gcc acg gcc tac ctg tgc ggg gtc aag ggc aac ttc cag acc att ggc 5366
Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly
1395 1400 1405

ttg agt gca gcc gcc cgc ttt aac cag tgc aac acg aca cgc ggc aac 5414
Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn
1410 1415 1420

gag gtc atc tcc gtg atg aat cgg gcc aag aaa gca ggg aag tca gtg 5462
Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val
1425 1430 1435

gga gtg gta acc acc aca cga gtg cag cac gcc tcg cca gcc ggc acc 5510
Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr
1440 1445 1450

tac gcc cac acg gtg aac cgc aac tgg tac tcg gac gcc gac gtg cct 5558
Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro
1455 1460 1465 1470

gcc tcg gcc cgc cag gag ggg tgc cag gac atc gct acg cag ctc atc 5606
Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile
1475 1480 1485

tcc aac atg gac att gac gtg atc cta ggt gga ggc cga aag tac atg 5654
Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met
1490 1495 1500

ttt ccc atg gga acc cca gac cct gag tac cca gat gac tac agc caa 5702
Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln
1505 1510 1515

ggt ggg acc agg ctg gac ggg aag aat ctg gtg cag gaa tgg ctg gcg 5750
Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala
1520 1525 1530

aag cgc cag ggt gcc cgg tat gtg tgg aac cgc act gag ctg atg cag 5798
Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln
1535 1540 1545 1550

gct tcc ctg gac ccg tct gtg acc cat ctc atg ggt ctc ttt gag cct 5846
Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro
1555 1560 1565

gga gac atg aaa tac gag atc cac cga gac tcc aca ctg gac ccc tcc 5894
Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser
1570 1575 1580

ctg atg gag atg aca gag gct gcc ctg cgc ctg ctg agc agg aac ccc 5942
Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro
1585 1590 1595

cgc ggc ttc ttc ctc ttc gtg gag ggt ggt cgc atc gac cat ggt cat 5990
Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His
1600 1605 1610

cat gaa agc agg gct tac cgg gca ctg act gag acg atc atg ttc gac 6038
His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp
1615 1620 1625 1630

gac gcc att gag agg gcg ggc cag ctc acc agc gag gag gac acg ctg 6086
Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu
1635 1640 1645

agc ctc gtc act gcc gac cac tcc cac gtc ttc tcc ttc gga ggc tac 6134
Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr
1650 1655 1660

ccc ctg cga ggg agc tgc atc ttc ggg ctg gcc cct ggc aag gcc cgg 6182
Pro Leu Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg
1665 1670 1675

gac agg aag gcc tac acg gtc ctc cta tac gga aac ggt cca ggc tat 6230
Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr
1680 1685 1690

gtg ctc aag gac ggc gcc cgg ccg gat gtt acc gag agc gag agc ggg 6278
Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly
1695 1700 1705 1710

agc ccc gag tat cgg cag cag tca gca gtg ccc ctg gac gaa gag acc 6326
Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr
1715 1720 1725

cac gca ggc gag gac gtg gcg gtg ttc gcg cgc ggc ccg cag gcg cac 6374
His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His
1730 1735 1740

ctg gtt cac ggc gtg cag gag cag acc ttc ata gcg cac gtc atg gcc 6422
Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala
1745 1750 1755

ttc gcc gcc tgc ctg gag ccc tac acc gcc tgc gac ctg gcg ccc ccc 6470
Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro
1760 1765 1770

gcc ggc acc acc gac gcc gcg cac ccg ggt taacccgtgg tccccgcgtt 6520
Ala Gly Thr Thr Asp Ala Ala His Pro Gly
1775 1780

gcttcctctg ctggccggga catcaggtgg cccccgctga attggaatcg atattgttac 6580 
aacaccccaa catcttcgac gcgggcgtgg caggtcttcc cgacgatgac gccggtgaac 6640 
ttcccgccgc cgttgttgtt ttggagcacg gaaagacgat gacggaaaaa gagatcgtgg 6700 
attacgtcgc cagtcaagta acaaccgcga aaaagttgcg cggaggagtt gtgtttgtgg 6760
acgaagtacc gaaaggtctt accggaaaac tcgacgcaag aaaaatcaga gagatcctca 6820 
taaaggccaa gaagggcgga aagtccaaat tgtaaaatgt aactgtattc agcgatgacg 6880
aaattcttag ctattgtaat actgcgatga gtggcagggc ggggcgtaat ttttttaagg 6940 
cagttattgg tgcccttaaa cgcctggtgc tacgcctgaa taagtgataa taagcggatg 7000 
aatggcagaa attcgccgga tctttgtgaa ggaaccttac ttctgtggtg tgacataatt 7060
ggacaaacta cctacagaga tttaaagctc taaggtaaat ataaaatttt taagtgtata 7120 
atgtgttaaa ctactgattc taattgtttg tgtattttag attccaacct atggaactga 7180 
tgaatgggag cagtggtgga atgcctttaa tgaggaaaac ctgttttgct cagaagaaat 7240 
gccatctagt gatgatgagg ctactgctga ctctcaacat tctactcctc caaaaaagaa 7300 
gagaaaggta gaagacccca aggactttcc ttcagaattg ctaagttttt tgagtcatgc 7360
tgtgtttagt aatagaactc ttgcttgctt tgctatttac accacaaagg aaaaagctgc 7420 
actgctatac aagaaaatta tggaaaaata ttctgtaacc tttataagta ggcataacag 7480 
ttataatcat aacatactgt tttttcttac tccacacagg catagagtgt ctgctattaa 7540 
taactatgct caaaaattgt gtacctttag ctttttaatt tgtaaagggg ttaataagga 7600 
atatttgatg tatagtgcct tgactagaga tcataatcag ccataccaca tttgtagagg 7660
ttttacttgc tttaaaaaac ctcccacacc tccccctgaa cctgaaacat aaaatgaatg 7720 
caattgttgt tgttaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 7780 
tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 7840 
tcatcaatgt atcttatcat gtctggatcc tctagagtcg acctgcaggc atgcaagctt 7900 
ctcgagagta cttctagtgg atccctgcag ctcgagaggc ctaattaatt aagtcgacga 7960 
tccggctgct aacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata 8020 
actagcataa ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg 8080
aactatatcc ggagttaact cgacatatac tatatagtaa taccaatact caagactacg 8140 
aaactgatac aatctcttat catgtgggta atgttctcga tgtcgaatag ccatatgccg 8200 
gtagttgcga tatacataaa ctgatcacta attccaaacc cacccgcttt ttatagtaag 8260
tttttcaccc ataaataata aatacaataa ttaatttctc gtaaaagtag aaaatatatt 8320

ctaatttatt gcacggtaag gaagtagaat cataaagaac agtgacggat cgatccccca 8380 

agcttggaca caagacaggc ttgcgagata tgtttgagaa taccacttta tcccgcgtca 8440 

gggagaggca gtgcgtaaaa agacgcggac tcatgtgaaa tactggtttt tagtgcgcca 8500 

gatctctata atctcgcgca acctattttc ccctcgaaca ctttttaagc cgtagataaa 8560 

caggctggga cacttcac atg agc gaa aaa tac atc gtc acc tgg gac atg 8611
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met
1785 1790 1795

ttg cag atc cat gca cgt aaa ctc gca agc cga ctg atg cct tct gaa 8659
Leu Gln Ile His Ala Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu
1800 1805 1810

caa tgg aaa ggc att att gcc gta agc cgt ggc ggt ctg gta ccg ggt 8707
Gln Trp Lys Gly Ile Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly
1815 1820 1825

gcg tta ctg gcg cgt gaa ctg ggt att cgt cat gtc gat acc gtt tgt 8755
Ala Leu Leu Ala Arg Glu Leu Gly Ile Arg His Val Asp Thr Val Cys
1830 1835 1840

att tcc agc tac gat cac gac aac cag cgc gag ctt aaa gtg ctg aaa 8803
Ile Ser Ser Tyr Asp His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys
1845 1850 1855

cgc gca gaa ggc gat ggc gaa ggc ttc atc gtt att gat gac ctg gtg 8851
Arg Ala Glu Gly Asp Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val
1860 1865 1870 1875

gat acc ggt ggt act gcg gtt gcg att cgt gaa atg tat cca aaa gcg 8899
Asp Thr Gly Gly Thr Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala
1880 1885 1890

cac ttt gtc acc atc ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat 8947
His Phe Val Thr Ile Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp
1895 1900 1905

gac tat gtt gtt gat atc ccg caa gat acc tgg att gaa cag ccg tgg 8995
Asp Tyr Val Val Asp Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp
1910 1915 1920

gat atg ggc gtc gta ttc gtc ccg cca atc tcc ggt cgc taatcttttc 9044
Asp Met Gly Val Val Phe Val Pro Pro Ile Ser Gly Arg
1925 1930 1935

aacgcctggc actgccgggc gttgttcttt ttaacttcag gcgggttaca atagtttcca 9104 

gtaagtattc tggaggctgc atccatgaca caggcaaacc tgagcgaaac cctgttcaaa 9164

ccccgcttta aacatcctga aacctcgacg ctagtccgcc gctttaatca cggcgcacaa 9224 

ccgcctgtgc agtcggccct tgatggtaaa accatccctc actggtatcg catgattaac 9284 

cgtctgatgt ggatctggcg cggcattgac ccacgcgaaa tcctcgacgt ccaggcacgt 9344

attgtgatga gcgatgccga acgtaccgac gatgatttat acgatacggt gattggctac 9404

cgtggcggca actggattta tgagtgggcc ccggatcttt gtgaaggaac cttacttctg 9464 

tggtgtgaca taattggaca aactacctac agagatttaa agctctaagg taaatataaa 9524 

atttttaagt gtataatgtg ttaaactact gattctaatt gtttgtgtat tttagattcc 9584 

aacctatgga actgatgaat gggagcagtg gtggaatgcc tttaatgagg aaaacctgtt 9644 

ttgctcagaa gaaatgccat ctagtgatga tgaggctact gctgactctc aacattctac 9704 

tcctccaaaa aagaagagaa aggtagaaga ccccaaggac tttccttcag aattgctaag 9764

ttttttgagt catgctgtgt ttagtaatag aactcttgct tgctttgcta tttacaccac 9824 

aaaggaaaaa gctgcactgc tatacaagaa aattatggaa aaatattctg taacctttat 9884 

aagtaggcat aacagttata atcataacat actgtttttt cttactccac acaggcatag 9944 

agtgtctgct attaataact atgctcaaaa attgtgtacc tttagctttt taatttgtaa 10004 

aggggttaat aaggaatatt tgatgtatag tgccttgact agagatcata atcagccata 10064 

ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga 10124 

aacataaaat gaatgcaatt gttgttgtta agcttggggg aattgcatgc tccggatcga 10184 

gatcaa ttc tgt gag cgt atg gca aac gaa gga aaa ata gtt ata gta 10232
Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val
1940 1945 1950

gcc gca ctc gat ggg aca ttt caa cgt aaa ccg ttt aat aat att ttg 10280
Ala Ala Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu
1955 1960 1965

aat ctt att cca tta tct gaa atg gtg gta aaa cta act gct gtg tgt 10328
Asn Leu Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys
1970 1975 1980

atg aaa tgc ttt aag gag gct tcc ttt tct aaa cga ttg ggt gag gaa 10376
Met Lys Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu
1985 1990 1995

acc gag ata gaa ata ata gga ggt aat gat atg tat caa tcg gtg tgt 10424
Thr Glu Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys
2000 2005 2010

aga aag tgt tac atc gac tca taatattata ttttttatct aaaaaactaa 10475
Arg Lys Cys Tyr Ile Asp Ser
2015 2020

aaataaacat tgattaaatt ttaatataat acttaaaaat ggatgttgtg tcgttagata 10535
aaccgtttat gtattttgag gaaattgata atgagttaga ttacgaacca gaaagtgcaa 10595
atgaggtcgc aaaaaaactg ccgtatcaag gacagttaaa actattacta ggagaattat 10655
tttttcttag taagttacag cgacacggta tattagatgg tgccaccgta gtgtatatag 10715
gatctgctcc cggtacacat atacgttatt tgagagatca tttctataat ttaggagtga 10775






tcatcaaatg gatgctaatt gacggccgcc atcatgatcc tattttaaat ggattgcgtg 10835
atgtgactct agtgactcgg ttcgttgatg aggaatatct acgatccatc aaaaaacaac 10895
tgcatccttc taagattatt ttaatttctg atgtgagatc caaacgagga ggaaatgaac 10955
ctagtacggc ggatttacta agtaattacg ctctacaaaa tgtcatgatt agtattttaa 11015
accccgtggc gtctagtctt aaatggagat gcccgtttcc agatcaatgg atcaaggact 11075
tttatatccc acacggtaat aaaatgttac aaccttttgc tccttcatat tcagggccgt 11135
cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc caacttaatc gccttgcagc 11195
acatccccct ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca 11255
acagttgcgc agcctgaatg gcgaatggcg cgacgcgccc tgtagcggcg cattaagcgc 11315
ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc tagcgcccgc 11375
tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc gtcaagctct 11435
aaatcggggg ctccctttag ggttccgatt tagtgcttta cggcacctcg accccaaaaa 11495
acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg tttttcgccc 11555
tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg gaacaacact 11615
caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt cggcctattg 11675
gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa tattaacgtt 11735
tacaatttcc caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt 11795
ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 11855
taatattgaa aaaggaagag t atg agt att caa cat ttc cgt gtc gcc ctt 11906
Met Ser Ile Gln His Phe Arg Val Ala Leu
2025 2030

att ccc ttt ttt gcg gca ttt tgc ctt cct gtt ttt gct cac cca gaa 11954
Ile Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu
2035 2040 2045

acg ctg gtg aaa gta aaa gat gct gaa gat cag ttg ggt gca cga gtg 12002
Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val
2050 2055 2060

ggt tac atc gaa ctg gat ctc aac agc ggt aag atc ctt gag agt ttt 12050
Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe
2065 2070 2075

cgc ccc gaa gaa cgt ttt cca atg atg agc act ttt aaa gtt ctg cta 12098
Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu
2080 2085 2090 2095

tgt ggc gcg gta tta tcc cgt att gac gcc ggg caa gag caa ctc ggt 12146
Cys Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly
2100 2105 2110

cgc cgc ata cac tat tct cag aat gac ttg gtt gag tac tca cca gtc 12194
Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val
2115 2120 2125

aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa tta tgc agt 12242
Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser
2130 2135 2140

gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt ctg aca 12290
Ala Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr
2145 2150 2155

acg atc gga gga ccg aag gag cta acc gct ttt ttg cac aac atg ggg 12338
Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly
2160 2165 2170 2175

gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc 12386
Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala
2180 2185 2190

ata cca aac gac gag cgt gac acc acg atg cct gta gca atg gca aca 12434
Ile Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr
2195 2200 2205

acg ttg cgc aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg 12482
Thr Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg
2210 2215 2220

caa caa tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt 12530
Gln Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu
2225 2230 2235

ctg cgc tcg gcc ctt ccg gct ggc tgg ttt att gct gat aaa tct gga 12578
Leu Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly
2240 2245 2250 2255

gcc ggt gag cgt ggg tct cgc ggt atc att gca gca ctg ggg cca gat 12626
Ala Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp
2260 2265 2270

ggt aag ccc tcc cgt atc gta gtt atc tac acg acg ggg agt cag gca 12674
Gly Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala
2275 2280 2285

act atg gat gaa cga aat aga cag atc gct gag ata ggt gcc tca ctg 12722
Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu
2290 2295 2300

att aag cat tgg taactgtcag accaagttta ctcatatata ctttagattg 12774
Ile Lys His Trp
2305

atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 12834 
tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 12894 
tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 12954 
aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 13014
aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 13074
taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 13134
taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 13194
agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 13254
tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca 13314
cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 13374
agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 13434
gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 13494
aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 13554
tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag 13614
ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg 13674
aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct 13734
ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt 13794
agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg 13854
gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat tacgcc 13910

<210> 9
<211> 2307
<212> PRT 

<213> Artificial Sequence 
<400> 9 

Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro Met Phe Ser Gly
1 5 10 15

Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr Gln Ile Ala Gln
20 25 30

Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn Arg Tyr Gly Thr
35 40 45

Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala Leu Glu Ala Thr
50 55 60

Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe Ser Val Ile Gly
65 70 75 80

Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu Met Gly Ile Pro
85 90 95

Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu Leu Ile
100 105 110

Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala
115 120 125

Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu Val Phe Phe Cys
130 135 140

Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala
145 150 155 160

Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro
165 170 175

Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala
180 185 190

Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val
195 200 205

Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe Thr Thr Arg Ala
210 215 220

Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn Ala Arg Gly Gly
225 230 235 240

Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His Pro Glu Leu Ile
245 250 255

Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly Pro Leu Met Val
260 265 270

Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val Arg Ala Gln Gly
275 280 285

Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr
290 295 300

Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr Gly Thr Tyr Ile
305 310 315 320

Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg
325 330 335

Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr
340 345 350

Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile
355 360 365

Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu Ile Leu Leu Gly
370 375 380

Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu Leu Ala Pro Ile
385 390 395 400

Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr
405 410 415

Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val
420 425 430

Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val
435 440 445

Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro
450 455 460

Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val
465 470 475 480

Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys
485 490 495

Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro
500 505 510

Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro
515 520 525

Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys Pro Ser
530 535 540

Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val
545 550 555 560

Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met
565 570 575

Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln
580 585 590

Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser
595 600 605

Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val
610 615 620

Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser
625 630 635 640

Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile
645 650 655

Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
660 665 670

Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu
675 680 685

Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
690 695 700

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala
705 710 715 720

Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val
725 730 735

Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro
740 745 750

Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
755 760 765

Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn
770 775 780

Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile
785 790 795 800

Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr
805 810 815

Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr
820 825 830

Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val
835 840 845

Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg
850 855 860

Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser
865 870 875 880

Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys
885 890 895

Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala
900 905 910

Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe
915 920 925

Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu
930 935 940

Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr
945 950 955 960

Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp
965 970 975

Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro
980 985 990

Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu
995 1000 1005

Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu
1010 1015 1020

Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala
1025 1030 1035 1040

Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg
1045 1050 1055

Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg Glu Leu Leu
1060 1065 1070

Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr
1075 1080 1085

Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu
1090 1095 1100

Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val
1105 1110 1115 1120

Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys His Met
1125 1130 1135

Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1140 1145 1150

Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile
1155 1160 1165

Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn Ile Leu Gly
1170 1175 1180

Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe
1185 1190 1195 1200

Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly
1205 1210 1215

Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly
1220 1225 1230

Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu
1235 1240 1245

Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu Ala Ser Glu Asp
1250 1255 1260

Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Glu Leu
1265 1270 1275 1280

Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu Gly Ile
1285 1290 1295

Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu Ala Ala
1300 1305 1310

Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr Ala Ala
1315 1320 1325

Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser Thr Val
1330 1335 1340

Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu Gly Pro
1345 1350 1355 1360

Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu Ser Lys
1365 1370 1375

Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr Ala Thr
1380 1385 1390

Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly Leu Ser
1395 1400 1405

Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn Glu Val
1410 1415 1420

Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val Gly Val
1425 1430 1435 1440

Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr Tyr Ala
1445 1450 1455

His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro Ala Ser
1460 1465 1470

Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile Ser Asn
1475 1480 1485

Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met Phe Pro
1490 1495 1500

Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln Gly Gly
1505 1510 1515 1520

Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala Lys Arg
1525 1530 1535

Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln Ala Ser
1540 1545 1550

Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro Gly Asp
1555 1560 1565

Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser Leu Met
1570 1575 1580

Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro Arg Gly
1585 1590 1595 1600

Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His His Glu
1605 1610 1615

Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp Asp Ala
1620 1625 1630

Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu Ser Leu
1635 1640 1645

Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr Pro Leu
1650 1655 1660

Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg Asp Arg
1665 1670 1675 1680

Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr Val Leu
1685 1690 1695

Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly Ser Pro
1700 1705 1710

Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr His Ala
1715 1720 1725

Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His Leu Val
1730 1735 1740

His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala Phe Ala
1745 1750 1755 1760

Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro Ala Gly
1765 1770 1775

Thr Thr Asp Ala Ala His Pro Gly Met Ser Glu Lys Tyr Ile Val Thr
1780 1785 1790

Trp Asp Met Leu Gln Ile His Ala Arg Lys Leu Ala Ser Arg Leu Met
1795 1800 1805

Pro Ser Glu Gln Trp Lys Gly Ile Ile Ala Val Ser Arg Gly Gly Leu
1810 1815 1820

Val Pro Gly Ala Leu Leu Ala Arg Glu Leu Gly Ile Arg His Val Asp
1825 1830 1835 1840

Thr Val Cys Ile Ser Ser Tyr Asp His Asp Asn Gln Arg Glu Leu Lys
1845 1850 1855

Val Leu Lys Arg Ala Glu Gly Asp Gly Glu Gly Phe Ile Val Ile Asp
1860 1865 1870

Asp Leu Val Asp Thr Gly Gly Thr Ala Val Ala Ile Arg Glu Met Tyr
1875 1880 1885

Pro Lys Ala His Phe Val Thr Ile Phe Ala Lys Pro Ala Gly Arg Pro
1890 1895 1900

Leu Val Asp Asp Tyr Val Val Asp Ile Pro Gln Asp Thr Trp Ile Glu
1905 1910 1915 1920

Gln Pro Trp Asp Met Gly Val Val Phe Val Pro Pro Ile Ser Gly Arg
1925 1930 1935

Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val Ala Ala
1940 1945 1950

Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu Asn Leu
1955 1960 1965

Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys Met Lys
1970 1975 1980

Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu Thr Glu
1985 1990 1995 2000

Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys Arg Lys
2005 2010 2015

Cys Tyr Ile Asp Ser Met Ser Ile Gln His Phe Arg Val Ala Leu Ile
2020 2025 2030

Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr
2035 2040 2045

Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly
2050 2055 2060

Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg
2065 2070 2075 2080

Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys
2085 2090 2095

Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg
2100 2105 2110

Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr
2115 2120 2125

Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala
2130 2135 2140

Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr
2145 2150 2155 2160

Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp
2165 2170 2175

His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile
2180 2185 2190

Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr
2195 2200 2205

Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln
2210 2215 2220

Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu
2225 2230 2235 2240

Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala
2245 2250 2255

Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly
2260 2265 2270

Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr
2275 2280 2285

Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile
2290 2295 2300

Lys His Trp
2305

<210> 10 
<211> 92 
<212> PRT 

<213> Artificial Sequence 
<400> 10 

Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro Met Phe Ser Gly
1 5 10 15

Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr Gln Ile Ala Gln
20 25 30

Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn Arg Tyr Gly Thr
35 40 45

Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala Leu Glu Ala Thr
50 55 60

Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe Ser Val Ile Gly
65 70 75 80

Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu
85 90



<210> 11 
<211> 1692 
<212> PRT 

<213> Artificial Sequence 
<400> 11 

Met Gly Ile Pro Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met
1 5 10 15

Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val
20 25 30

Leu Asn Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu
35 40 45

Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly
50 55 60

Ala Ala Tyr Ala Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu
65 70 75 80

Ala Leu Pro Pro Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser
85 90 95

Cys Gly Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro
100 105 110

Tyr Tyr Lys Val Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe
115 120 125

Thr Thr Arg Ala Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn
130 135 140

Ala Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His
145 150 155 160

Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly
165 170 175

Pro Leu Met Val Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val
180 185 190

Arg Ala Gln Gly Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala
195 200 205

Gly Gly His Tyr Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr
210 215 220

Gly Thr Tyr Ile Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His
225 230 235 240

Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser
245 250 255

Asp Met Glu Thr Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Cys
260 265 270

Gly Asp Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu
275 280 285

Ile Leu Leu Gly Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu
290 295 300

Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
305 310 315 320

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
325 330 335

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
340 345 350

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
355 360 365

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
370 375 380

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
385 390 395 400

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
405 410 415

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
420 425 430

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu
435 440 445

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
450 455 460

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
465 470 475 480

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
485 490 495

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
500 505 510

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
515 520 525

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
530 535 540

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
545 550 555 560

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
565 570 575

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
580 585 590

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
595 600 605

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
610 615 620

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
625 630 635 640

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
645 650 655

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
660 665 670

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
675 680 685

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
690 695 700

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
705 710 715 720

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
725 730 735

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
740 745 750

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
755 760 765

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
770 775 780

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
785 790 795 800

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
805 810 815

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
820 825 830

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
835 840 845

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
850 855 860

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
865 870 875 880

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
885 890 895

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
900 905 910

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
915 920 925

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
930 935 940

Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
945 950 955 960

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
965 970 975

Arg Glu Leu Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
980 985 990

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
995 1000 1005

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
1010 1015 1020

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
1025 1030 1035 1040

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
1045 1050 1055

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
1060 1065 1070

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
1075 1080 1085

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
1090 1095 1100

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
1105 1110 1115 1120

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
1125 1130 1135

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
1140 1145 1150

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu
1155 1160 1165

Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly
1170 1175 1180

Ala Leu Glu Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu
1185 1190 1195 1200

Ser Leu Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn
1205 1210 1215

Arg Glu Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala
1220 1225 1230

Gln Thr Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly
1235 1240 1245

Val Ser Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp
1250 1255 1260

Lys Leu Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val
1265 1270 1275 1280

Ala Leu Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly
1285 1290 1295

Ala Thr Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr
1300 1305 1310

Ile Gly Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg
1315 1320 1325

Gly Asn Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys
1330 1335 1340

Ser Val Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala
1345 1350 1355 1360

Gly Thr Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp
1365 1370 1375

Val Pro Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln
1380 1385 1390

Leu Ile Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys
1395 1400 1405

Tyr Met Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr
1410 1415 1420

Ser Gln Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp
1425 1430 1435 1440

Leu Ala Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu
1445 1450 1455

Met Gln Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe
1460 1465 1470

Glu Pro Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp
1475 1480 1485

Pro Ser Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg
1490 1495 1500

Asn Pro Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His
1505 1510 1515 1520

Gly His His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met
1525 1530 1535

Phe Asp Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp
1540 1545 1550

Thr Leu Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly
1555 1560 1565

Gly Tyr Pro Leu Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys
1570 1575 1580

Ala Arg Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro
1585 1590 1595 1600

Gly Tyr Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu
1605 1610 1615

Ser Gly Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu
1620 1625 1630

Glu Thr His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln
1635 1640 1645

Ala His Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val
1650 1655 1660

Met Ala Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala
1665 1670 1675 1680

Pro Pro Ala Gly Thr Thr Asp Ala Ala His Pro Gly
1685 1690



<210> 12 
<211> 152 
<212> PRT 

<213> Artificial Sequence 
<400> 12 

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150



<210> 13 
<211> 85 
<212> PRT 

<213> Artificial Sequence 
<400> 13 

Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val Ala Ala
1 5 10 15

Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu Asn Leu
20 25 30

Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys Met Lys
35 40 45

Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu Thr Glu
50 55 60

Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys Arg Lys
65 70 75 80

Cys Tyr Ile Asp Ser
85



<210> 14 
<211> 286 
<212> PRT 

<213> Artificial Sequence 
<400> 14 

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1 5 10 15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
20 25 30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
35 40 45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
50 55 60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65 70 75 80

Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
85 90 95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
100 105 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
115 120 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
130 135 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145 150 155 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
165 170 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
180 185 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
195 200 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
210 215 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225 230 235 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
260 265 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp
275 280 285



<210> 15 
<211> 13910 
<212> DNA 

<213> Artificial Sequence 
<220>
<223> Description of Artificial Sequence: plasmid phcap 4

<220>
<221> CDS
<222> (497)..(772)

<220>
<221> CDS
<222> (1425)..(6500)

<220>
<221> CDS
<222> (8579)..(9034)

<220>
<221> CDS
<222> (10191)..(10445)

<220>
<221> CDS
<222> (11877)..(12734)

<220>
<221> misc_feature
<222> (1)..(774)
<223> Vaccinia Virus thymidine Kinase gene recombination site

<220>
<221> promoter
<222> (794)..(816)
<223> T7 promoter

<220>
<221> misc_feature
<222> (846)..(1424)
<223> EMC/Internal Ribosome Entry Site (IRES)

<220>
<221> misc_feature
<222> (1426)..(1437)
<223> MCS (Multiple Cloning Site)

<220>
<221> misc_feature
<222> (1446)..(2318)
<223> HCV E2/NS2 domain

<220>
<221> misc_feature
<222> (2319)..(4231)
<223> HCV NS3 Domain containing the serine protease and
helicase enzymes

<220>
<221> misc_feature
<222> (4203)..(4260)
<223> HCV NS3-NS4A cleavage site

<220>
<221> misc_feature
<222> (4375)..(4424)
<223> HCV NS4A-4B cleavage site

<220>
<221> misc_feature
<222> (4233)..(4394)
<223> HCV NS4A domain

<220>
<221> misc_feature
<222> (4395)..(4919)
<223> HCV NS4B Domain

<220>
<221> misc_feature
<222> (4920)..(4991)
<223> HCV NS5A-NS5B cleavage site

<220>
<221> misc_feature
<222> (4992)..(6501)
<223> SEAP Protein

<220>
<221> misc_feature
<222> (7915)..(7945)
<223> MCS (Multiple Cloning Site)

<220>
<221> terminator
<222> (7938)..(8078)
<223> term T7

<220>
<221> promoter
<222> (8080)..(8365)
<223> Vaccinia virus promoter; early/late promoter

<220>
<221> misc_feature
<222> (8560)..(11317)
<223> E. coli gpt; for selection of recombinants

<220>
<221> misc_feature
<222> (11318)..(13909)
<223> remaining DNA from 3' end of Tropix-pCMV/SEAP
plasmid



<400> 15 
aagcttttgc gatcaataaa tggatcacaa ccagtatctc ttaacgatgt tcttcgcaga 60

tgatgattca ttttttaagt atttggctag tcaagatgat gaatcttcat tatctgatat 120

attgcaaatc actcaatatc tagactttct gttattatta ttgatccaat caaaaaataa 180

attagaagcc gtgggtcatt gttatgaatc tctttcagag gaatacagac aattgacaaa 240

attcacagac tttcaagatt ttaaaaaact gtttaacaag gtccctattg ttacagatgg 300

aagggtcaaa cttaataaag gatatttgtt cgactttgtg attagtttga tgcgattcaa 360

aaaagaatcc tctctagcta ccaccgcaat agatcctgtt agatacatag atcctcgtcg 420

caatatcgca ttttctaacg tgatggatat attaaagtcg aataaagtga acaataatta 480

attctttatt gtcatc atg aac ggc gga cat att cag ttg ata atc ggc ccc 532
Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro
1 5 10

atg ttt tca ggt aaa agt aca gaa tta att aga cga gtt aga cgt tat 580
Met Phe Ser Gly Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr
15 20 25

caa ata gct caa tat aaa tgc gtg act ata aaa tat tct aac gat aat 628
Gln Ile Ala Gln Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn
30 35 40

aga tac gga acg gga cta tgg acg cat gat aag aat aat ttt gaa gca 676
Arg Tyr Gly Thr Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala
45 50 55 60

ttg gaa gca act aaa cta tgt gat gtc ttg gaa tca att aca gat ttc 724
Leu Glu Ala Thr Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe
65 70 75

tcc gtg ata ggt atc gat gaa gga cag ttc ttt cca gac att gtt gaa 772
Ser Val Ile Gly Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu
80 85 90

ttgatctcga tcccgcgaaa ttaatacgac tcactatagg gagaccacaa cggtttccct 832 

ctagcgggat caattccgcc cctctccctc ccccccccct aacgttactg gccgaagccg 892

cttggaataa ggccggtgtg cgtttgtcta tatgttattt tccaccatat tgccgtcttt 952

tggcaatgtg agggcccgga aacctggccc tgtcttcttg acgagcattc ctaggggtct 1012

ttcccctctc gccaaaggaa tgcaaggtct gttgaatgtc gtgaaggaag cagttcctct 1072

ggaagcttct tgaagacaaa caacgtctgt agcgaccctt tgcaggcagc ggaacccccc 1132

acctggcgac aggtgcctct gcggccaaaa gccacgtgta taagatacac ctgcaaaggc 1192

ggcacaaccc cagtgccacg ttgtgagttg gatagttgtg gaaagagtca aatggctctc 1252

ctcaagcgta ttcaacaagg ggctgaagga tgcccagaag gtaccccatt gtatgggatc 1312

tgatctgggg cctcggtgca catgctttac atgtgtttag tcgaggttaa aaaacgtcta 1372

ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgataata cc atg gga 1430
Met Gly

att ccc caa ttc atg gca cgt gtc tgt gcc tgc ttg tgg atg atg ctg 1478
Ile Pro Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu
95 100 105 110

ctg ata gcc cag gcc gag gcc gcc ttg gag aac ctg gtg gtc ctc aat 1526
Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn
115 120 125

gcg gcg tct gtg gcc ggc gca cat ggc atc ctc tcc ttc ctt gtg ttc 1574
Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu Val Phe
130 135 140

ttc tgt gcc gcc tgg tac atc aaa ggc agg ctg gtc cct ggg gcg gca 1622
Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala
145 150 155

tat gct ctt tat ggc gtg tgg ccg ctg ctc ctg ctc ttg ctg gca tta 1670
Tyr Ala Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu
160 165 170

cca ccg cga gct tac gcc atg gac cgg gag atg gct gca tcg tgc gga 1718
Pro Pro Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser Cys Gly
175 180 185 190

ggc gcg gtt ttt gtg ggt ctg gta ctc ctg act ttg tca cca tac tac 1766
Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro Tyr Tyr
195 200 205

aag gtg ttc ctc gct agg ctc ata tgg tgg tta caa tat ttt acc acc 1814
Lys Val Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe Thr Thr
210 215 220

aga gcc gag gcg cac tta cat gtg tgg atc ccc ccc ctc aac gct cgg 1862
Arg Ala Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn Ala Arg
225 230 235

gga ggc cgc gat gcc atc atc ctc ctc atg tgc gca gtc cat cca gag 1910
Gly Gly Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His Pro Glu
240 245 250

cta atc ttt gac atc acc aaa ctt cta att gcc ata ctc ggt ccg ctc 1958
Leu Ile Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly Pro Leu
255 260 265 270

atg gtg ctc caa gct ggc ata acc aga gtg ccg tac ttc gtg cgc gct 2006
Met Val Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val Arg Ala
275 280 285

caa ggg ctc att cat gca tgc atg tta gtg cgg aag gtc gct ggg ggt 2054
Gln Gly Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly
290 295 300

cat tat gtc caa atg gcc ttc atg aag ctg ggc gcg ctg aca ggc acg 2102
His Tyr Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr Gly Thr
305 310 315

tac att tac aac cat ctt acc ccg cta cgg gat tgg gcc cac gcg ggc 2150
Tyr Ile Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly
320 325 330

cta cga gac ctt gcg gtg gca gtg gag ccc gtc gtc ttc tcc gac atg 2198
Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser Asp Met
335 340 345 350

gag acc aag atc atc acc tgg gga gca gac acc gcg gcg gct ggg gac 2246
Glu Thr Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Ala Gly Asp
355 360 365

atc atc ttg ggt ctg ccc gtc tcc gcc cga agg gga aag gag ata ctc 2294
Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu Ile Leu
370 375 380

ctg ggc ccg gcc gat agt ctt gaa ggg cgg ggg tgg cga ctc ctc gcg 2342
Leu Gly Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu Leu Ala
385 390 395

ccc atc acg gcc tac tcc caa cag acg cgg ggc cta ctt ggt tgc atc 2390
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
400 405 410

atc act agc ctt aca ggc cgg gac aag aac cag gtc gag gga gag gtt 2438
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
415 420 425 430

cag gtg gtt tcc acc gca aca caa tcc ttc ctg gcg acc tgc gtc aac 2486
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
435 440 445

ggc gtg tgt tgg acc gtt tac cat ggt gct ggc tca aag acc tta gcc 2534
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
450 455 460

ggc cca aag ggg cca atc acc cag atg tac act aat gtg gac cag gac 2582
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
465 470 475

ctc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tcc ttg aca cca tgc 2630
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
480 485 490

acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct gac gtc 2678
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
495 500 505 510

att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc tcc ccc 2726
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
515 520 525

agg cct gtc tcc tac ttg aag ggc tct gcg ggt ggt cca ctg ctc tgc 2774
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys
530 535 540

cct tcg ggg cac gct gtg ggc atc ttc cgg gct gcc gta tgc acc cgg 2822
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
545 550 555

ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tcc atg gaa act 2870
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
560 565 570

act atg cgg tct ccg gtc ttc acg gac aac tca tcc ccc ccg gcc gta 2918
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
575 580 585 590

ccg cag tca ttt caa gtg gcc cac cta cac gct ccc act ggc agc ggc 2966
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
595 600 605

aag agt act aaa gtg ccg gct gca tat gca gcc caa ggg tac aag gtg 3014
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
610 615 620

ctc gtc ctc aat ccg tcc gtt gcc gct acc tta ggg ttt ggg gcg tat 3062
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
625 630 635

atg tct aag gca cac ggt att gac ccc aac atc aga act ggg gta agg 3110
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
640 645 650

acc att acc aca ggc gcc ccc gtc aca tac tct acc tat ggc aag ttt 3158
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
655 660 665 670

ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata ata tgt 3206
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
675 680 685

gat gag tgc cat tca act gac tcg act aca atc ttg ggc atc ggc aca 3254
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
690 695 700

gtc ctg gac caa gcg gag acg gct gga gcg cgg ctt gtc gtg ctc gcc 3302
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
705 710 715

acc gct acg cct ccg gga tcg gtc acc gtg cca cac cca aac atc gag 3350
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
720 725 730

gag gtg gcc ctg tct aat act gga gag atc ccc ttc tat ggc aaa gcc 3398
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
735 740 745 750

atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc tgt cat 3446
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
755 760 765

tcc aag aag aag tgc gac gag ctc gcc gca aag ctg tca ggc ctc gga 3494
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
770 775 780

atc aac gct gtg gcg tat tac cgg ggg ctc gat gtg tcc gtc ata cca 3542
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
785 790 795

act atc gga gac gtc gtt gtc gtg gca aca gac gct ctg atg acg ggc 3590
Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
800 805 810

tat acg ggc gac ttt gac tca gtg atc gac tgt aac aca tgt gtc acc 3638
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
815 820 825 830

cag aca gtc gac ttc agc ttg gat ccc acc ttc acc att gag acg acg 3686
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
835 840 845

acc gtg cct caa gac gca gtg tcg cgc tcg cag cgg cgg ggt agg act 3734
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
850 855 860

ggc agg ggt agg aga ggc atc tac agg ttt gtg act ccg gga gaa cgg 3782
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
865 870 875

ccc tcg ggc atg ttc gat tcc tcg gtc ctg tgt gag tgc tat gac gcg 3830
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
880 885 890

ggc tgt gct tgg tac gag ctc acc ccc gcc gag acc tcg gtt agg ttg 3878
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu
895 900 905 910

cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac cac ctg 3926
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
915 920 925

gag ttc tgg gag agt gtc ttc aca ggc ctc acc cat ata gat gca cac 3974
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
930 935 940

ttc ttg tcc cag acc aag cag gca gga gac aac ttc ccc tac ctg gta 4022
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
945 950 955

gca tac caa gcc acg gtg tgc gcc agg gct cag gcc cca cct cca tca 4070
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
960 965 970

tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg ctg cac 4118
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
975 980 985 990

ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat gag gtc 4166
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
995 1000 1005

acc ctc acc cac ccc ata acc aaa tac atc atg gca tgc atg tcg gct 4214
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
1010 1015 1020

gac ctg gag gtc gtc act agc acc tgg gtg ctg gtg ggc gga gtc ctt 4262
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
1025 1030 1035

gca gct ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc att gtg 4310
Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile Val
1040 1045 1050

ggt agg att atc ttg tcc ggg agg ccg gcc att gtt ccc gac agg gag 4358
Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg Glu
1055 1060 1065 1070

ctt ctc tac cag gag ttc gat gaa atg gaa gag tgc gcc tcg cac ctc 4406
Leu Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu
1075 1080 1085



cct tac atc gag cag gga atg cag ctc gcc gag caa ttc aag cag aaa 4454
Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln Lys
1090 1095 1100

gcg ctc ggg tta ctg caa aca gcc acc aaa caa gcg gag gct gct gct 4502
Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala
1105 1110 1115

ccc gtg gtg gag tcc aag tgg cga gcc ctt gag aca ttc tgg gcg aag 4550
Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys
1120 1125 1130

cac atg tgg aat ttc atc agc ggg ata cag tac tta gca ggc tta tcc 4598
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
1135 1140 1145 1150

act ctg cct ggg aac ccc gca ata gca tca ttg atg gca ttc aca gcc 4646
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
1155 1160 1165

tct atc acc agc ccg ctc acc acc caa agt acc ctc ctg ttt aac atc 4694
Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn Ile
1170 1175 1180

ttg ggg ggg tgg gtg gct gcc caa ctc gcc ccc ccc agc gcc gct tcg 4742
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser
1185 1190 1195

gct ttc gtg ggc gcc ggc atc gcc ggt gcg gct gtt ggc agc ata ggc 4790
Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile Gly
1200 1205 1210

ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca gga gtg 4838
Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
1215 1220 1225 1230

gcc ggc gcg ctc gtg gcc ttt aag gtc atg agc ggc gag atg ccc tcc 4886
Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser
1235 1240 1245

acc gag gac ctg gtc aat cta ctt cct gcc atc ctc gag gaa gct agt 4934
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu Ala Ser
1250 1255 1260

gag gat gtc gtc tgc tgc tca atg tcc tac aca tgg aca ggc gcc ttg 4982
Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu
1265 1270 1275

gag ctg ctg ctg ctg ctg ctg ctg ggc ctg agg cta cag ctc tcc ctg 5030
Glu Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu
1280 1285 1290

ggc atc atc cca gtt gag gag gag aac ccg gac ttc tgg aac cgc gag 5078
Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu
1295 1300 1305 1310

gca gcc gag gcc ctg ggt gcc gcc aag aag ctg cag cct gca cag aca 5126
Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr
1315 1320 1325

gcc gcc aag aac ctc atc atc ttc ctg ggc gat ggg atg ggg gtg tct 5174
Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser
1330 1335 1340

acg gtg aca gct gcc agg atc cta aaa ggg cag aag aag gac aaa ctg 5222
Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu
1345 1350 1355

ggg cct gag ata ccc ctg gcc atg gac cgc ttc cca tat gtg gct ctg 5270
Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu
1360 1365 1370

tcc aag aca tac aat gta gac aaa cat gtg cca gac agt gga gcc aca 5318
Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr
1375 1380 1385 1390

gcc acg gcc tac ctg tgc ggg gtc aag ggc aac ttc cag acc att ggc 5366
Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly
1395 1400 1405

ttg agt gca gcc gcc cgc ttt aac cag tgc aac acg aca cgc ggc aac 5414
Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn
1410 1415 1420

gag gtc atc tcc gtg atg aat cgg gcc aag aaa gca ggg aag tca gtg 5462
Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val
1425 1430 1435

gga gtg gta acc acc aca cga gtg cag cac gcc tcg cca gcc ggc acc 5510
Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr
1440 1445 1450

tac gcc cac acg gtg aac cgc aac tgg tac tcg gac gcc gac gtg cct 5558
Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro
1455 1460 1465 1470

gcc tcg gcc cgc cag gag ggg tgc cag gac atc gct acg cag ctc atc 5606
Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile
1475 1480 1485

tcc aac atg gac att gac gtg atc cta ggt gga ggc cga aag tac atg 5654
Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met
1490 1495 1500

ttt ccc atg gga acc cca gac cct gag tac cca gat gac tac agc caa 5702
Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln
1505 1510 1515

ggt ggg acc agg ctg gac ggg aag aat ctg gtg cag gaa tgg ctg gcg 5750
Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala
1520 1525 1530

aag cgc cag ggt gcc cgg tat gtg tgg aac cgc act gag ctg atg cag 5798
Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln
1535 1540 1545 1550

gct tcc ctg gac ccg tct gtg acc cat ctc atg ggt ctc ttt gag cct 5846
Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro
1555 1560 1565

gga gac atg aaa tac gag atc cac cga gac tcc aca ctg gac ccc tcc 5894
Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser
1570 1575 1580

ctg atg gag atg aca gag gct gcc ctg cgc ctg ctg agc agg aac ccc 5942
Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro
1585 1590 1595

cgc ggc ttc ttc ctc ttc gtg gag ggt ggt cgc atc gac cat ggt cat 5990
Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His
1600 1605 1610

cat gaa agc agg gct tac cgg gca ctg act gag acg atc atg ttc gac 6038
His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp
1615 1620 1625 1630

gac gcc att gag agg gcg ggc cag ctc acc agc gag gag gac acg ctg 6086
Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu
1635 1640 1645

agc ctc gtc act gcc gac cac tcc cac gtc ttc tcc ttc gga ggc tac 6134
Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr
1650 1655 1660

ccc ctg cga ggg agc tgc atc ttc ggg ctg gcc cct ggc aag gcc cgg 6182
Pro Leu Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg
1665 1670 1675

gac agg aag gcc tac acg gtc ctc cta tac gga aac ggt cca ggc tat 6230
Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr
1680 1685 1690

gtg ctc aag gac ggc gcc cgg ccg gat gtt acc gag agc gag agc ggg 6278
Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly
1695 1700 1705 1710

agc ccc gag tat cgg cag cag tca gca gtg ccc ctg gac gaa gag acc 6326
Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr
1715 1720 1725

cac gca ggc gag gac gtg gcg gtg ttc gcg cgc ggc ccg cag gcg cac 6374
His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His
1730 1735 1740

ctg gtt cac ggc gtg cag gag cag acc ttc ata gcg cac gtc atg gcc 6422
Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala
1745 1750 1755

ttc gcc gcc tgc ctg gag ccc tac acc gcc tgc gac ctg gcg ccc ccc 6470
Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro
1760 1765 1770

gcc ggc acc acc gac gcc gcg cac ccg ggt taacccgtgg tccccgcgtt 6520
Ala Gly Thr Thr Asp Ala Ala His Pro Gly
1775 1780

gcttcctctg ctggccggga catcaggtgg cccccgctga attggaatcg atattgttac 6580
aacaccccaa catcttcgac gcgggcgtgg caggtcttcc cgacgatgac gccggtgaac 6640
ttcccgccgc cgttgttgtt ttggagcacg gaaagacgat gacggaaaaa gagatcgtgg 6700
attacgtcgc cagtcaagta acaaccgcga aaaagttgcg cggaggagtt gtgtttgtgg 6760
acgaagtacc gaaaggtctt accggaaaac tcgacgcaag aaaaatcaga gagatcctca 6820
taaaggccaa gaagggcgga aagtccaaat tgtaaaatgt aactgtattc agcgatgacg 6880
aaattcttag ctattgtaat actgcgatga gtggcagggc ggggcgtaat ttttttaagg 6940
cagttattgg tgcccttaaa cgcctggtgc tacgcctgaa taagtgataa taagcggatg 7000
aatggcagaa attcgccgga tctttgtgaa ggaaccttac ttctgtggtg tgacataatt 7060
ggacaaacta cctacagaga tttaaagctc taaggtaaat ataaaatttt taagtgtata 7120

atgtgttaaa ctactgattc taattgtttg tgtattttag attccaacct atggaactga 7180 

tgaatgggag cagtggtgga atgcctttaa tgaggaaaac ctgttttgct cagaagaaat 7240

gccatctagt gatgatgagg ctactgctga ctctcaacat tctactcctc caaaaaagaa 7300 

gagaaaggta gaagacccca aggactttcc ttcagaattg ctaagttttt tgagtcatgc 7360 

tgtgtttagt aatagaactc ttgcttgctt tgctatttac accacaaagg aaaaagctgc 7420 

actgctatac aagaaaatta tggaaaaata ttctgtaacc tttataagta ggcataacag 7480 

ttataatcat aacatactgt tttttcttac tccacacagg catagagtgt ctgctattaa 7540 

taactatgct caaaaattgt gtacctttag ctttttaatt tgtaaagggg ttaataagga 7600 

atatttgatg tatagtgcct tgactagaga tcataatcag ccataccaca tttgtagagg 7660 

ttttacttgc tttaaaaaac ctcccacacc tccccctgaa cctgaaacat aaaatgaatg 7720 

caattgttgt tgttaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 7780 

tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 7840 

tcatcaatgt atcttatcat gtctggatcc tctagagtcg acctgcaggc atgcaagctt 7900 

ctcgagagta cttctagtgg atccctgcag ctcgagaggc ctaattaatt aagtcgacga 7960 

tccggctgct aacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata 8020 

actagcataa ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg 8080 

aactatatcc ggagttaact cgacatatac tatatagtaa taccaatact caagactacg 8140 

aaactgatac aatctcttat catgtgggta atgttctcga tgtcgaatag ccatatgccg 8200 

gtagttgcga tatacataaa ctgatcacta attccaaacc cacccgcttt ttatagtaag 8260 

tttttcaccc ataaataata aatacaataa ttaatttctc gtaaaagtag aaaatatatt 8320 

ctaatttatt gcacggtaag gaagtagaat cataaagaac agtgacggat cgatccccca 8380 

agcttggaca caagacaggc ttgcgagata tgtttgagaa taccacttta tcccgcgtca 8440 

gggagaggca gtgcgtaaaa agacgcggac tcatgtgaaa tactggtttt tagtgcgcca 8500 

gatctctata atctcgcgca acctattttc ccctcgaaca ctttttaagc cgtagataaa 8560 

caggctggga cacttcac atg agc gaa aaa tac atc gtc acc tgg gac atg 8611
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met
1785 1790 1795

ttg cag atc cat gca cgt aaa ctc gca agc cga ctg atg cct tct gaa 8659
Leu Gln Ile His Ala Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu
1800 1805 1810

caa tgg aaa ggc att att gcc gta agc cgt ggc ggt ctg gta ccg ggt 8707
Gln Trp Lys Gly Ile Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly
1815 1820 1825

gcg tta ctg gcg cgt gaa ctg ggt att cgt cat gtc gat acc gtt tgt 8755
Ala Leu Leu Ala Arg Glu Leu Gly Ile Arg His Val Asp Thr Val Cys
1830 1835 1840

att tcc agc tac gat cac gac aac cag cgc gag ctt aaa gtg ctg aaa 8803
Ile Ser Ser Tyr Asp His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys
1845 1850 1855

cgc gca gaa ggc gat ggc gaa ggc ttc atc gtt att gat gac ctg gtg 8851
Arg Ala Glu Gly Asp Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val
1860 1865 1870 1875

gat acc ggt ggt act gcg gtt gcg att cgt gaa atg tat cca aaa gcg 8899
Asp Thr Gly Gly Thr Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala
1880 1885 1890

cac ttt gtc acc atc ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat 8947
His Phe Val Thr Ile Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp
1895 1900 1905

gac tat gtt gtt gat atc ccg caa gat acc tgg att gaa cag ccg tgg 8995
Asp Tyr Val Val Asp Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp
1910 1915 1920

gat atg ggc gtc gta ttc gtc ccg cca atc tcc ggt cgc taatcttttc 9044
Asp Met Gly Val Val Phe Val Pro Pro Ile Ser Gly Arg
1925 1930 1935

aacgcctggc actgccgggc gttgttcttt ttaacttcag gcgggttaca atagtttcca 9104
gtaagtattc tggaggctgc atccatgaca caggcaaacc tgagcgaaac cctgttcaaa 9164
ccccgcttta aacatcctga aacctcgacg ctagtccgcc gctttaatca cggcgcacaa 9224
ccgcctgtgc agtcggccct tgatggtaaa accatccctc actggtatcg catgattaac 9284
cgtctgatgt ggatctggcg cggcattgac ccacgcgaaa tcctcgacgt ccaggcacgt 9344
attgtgatga gcgatgccga acgtaccgac gatgatttat acgatacggt gattggctac 9404
cgtggcggca actggattta tgagtgggcc ccggatcttt gtgaaggaac cttacttctg 9464
tggtgtgaca taattggaca aactacctac agagatttaa agctctaagg taaatataaa 9524
atttttaagt gtataatgtg ttaaactact gattctaatt gtttgtgtat tttagattcc 9584
aacctatgga actgatgaat gggagcagtg gtggaatgcc tttaatgagg aaaacctgtt 9644
ttgctcagaa gaaatgccat ctagtgatga tgaggctact gctgactctc aacattctac 9704
tcctccaaaa aagaagagaa aggtagaaga ccccaaggac tttccttcag aattgctaag 9764
ttttttgagt catgctgtgt ttagtaatag aactcttgct tgctttgcta tttacaccac 9824
aaaggaaaaa gctgcactgc tatacaagaa aattatggaa aaatattctg taacctttat 9884
aagtaggcat aacagttata atcataacat actgtttttt cttactccac acaggcatag 9944
agtgtctgct attaataact atgctcaaaa attgtgtacc tttagctttt taatttgtaa 10004

aggggttaat aaggaatatt tgatgtatag tgccttgact agagatcata atcagccata 10064 

ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga 10124 

aacataaaat gaatgcaatt gttgttgtta agcttggggg aattgcatgc tccggatcga 10184 

gatcaa ttc tgt gag cgt atg gca aac gaa gga aaa ata gtt ata gta 10232
Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val
1940 1945 1950

gcc gca ctc gat ggg aca ttt caa cgt aaa ccg ttt aat aat att ttg 10280
Ala Ala Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu
1955 1960 1965

aat ctt att cca tta tct gaa atg gtg gta aaa cta act gct gtg tgt 10328
Asn Leu Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys
1970 1975 1980

atg aaa tgc ttt aag gag gct tcc ttt tct aaa cga ttg ggt gag gaa 10376
Met Lys Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu
1985 1990 1995

acc gag ata gaa ata ata gga ggt aat gat atg tat caa tcg gtg tgt 10424
Thr Glu Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys
2000 2005 2010

aga aag tgt tac atc gac tca taatattata ttttttatct aaaaaactaa 10475
Arg Lys Cys Tyr Ile Asp Ser
2015 2020

aaataaacat tgattaaatt ttaatataat acttaaaaat ggatgttgtg tcgttagata 10535
aaccgtttat gtattttgag gaaattgata atgagttaga ttacgaacca gaaagtgcaa 10595
atgaggtcgc aaaaaaactg ccgtatcaag gacagttaaa actattacta ggagaattat 10655
tttttcttag taagttacag cgacacggta tattagatgg tgccaccgta gtgtatatag 10715
gatctgctcc cggtacacat atacgttatt tgagagatca tttctataat ttaggagtga 10775
tcatcaaatg gatgctaatt gacggccgcc atcatgatcc tattttaaat ggattgcgtg 10835
atgtgactct agtgactcgg ttcgttgatg aggaatatct acgatccatc aaaaaacaac 10895
tgcatccttc taagattatt ttaatttctg atgtgagatc caaacgagga ggaaatgaac 10955
ctagtacggc ggatttacta agtaattacg ctctacaaaa tgtcatgatt agtattttaa 11015
accccgtggc gtctagtctt aaatggagat gcccgtttcc agatcaatgg atcaaggact 11075
tttatatccc acacggtaat aaaatgttac aaccttttgc tccttcatat tcagggccgt 11135
cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc caacttaatc gccttgcagc 11195
acatccccct ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca 11255
acagttgcgc agcctgaatg gcgaatggcg cgacgcgccc tgtagcggcg cattaagcgc 11315
ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc tagcgcccgc 11375

tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc gtcaagctct 11435 

aaatcggggg ctccctttag ggttccgatt tagtgcttta cggcacctcg accccaaaaa 11495 

acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg tttttcgccc 11555 

tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg gaacaacact 11615 

caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt cggcctattg 11675 

gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa tattaacgtt 11735 

tacaatttcc caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt 11795 

ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 11855 

taatattgaa aaaggaagag t atg agt att caa cat ttc cgt gtc gcc ctt 11906
Met Ser Ile Gln His Phe Arg Val Ala Leu
2025 2030

att ccc ttt ttt gcg gca ttt tgc ctt cct gtt ttt gct cac cca gaa 11954
Ile Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu
2035 2040 2045

acg ctg gtg aaa gta aaa gat gct gaa gat cag ttg ggt gca cga gtg 12002
Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val
2050 2055 2060

ggt tac atc gaa ctg gat ctc aac agc ggt aag atc ctt gag agt ttt 12050
Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe
2065 2070 2075

cgc ccc gaa gaa cgt ttt cca atg atg agc act ttt aaa gtt ctg cta 12098
Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu
2080 2085 2090 2095

tgt ggc gcg gta tta tcc cgt att gac gcc ggg caa gag caa ctc ggt 12146
Cys Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly
2100 2105 2110

cgc cgc ata cac tat tct cag aat gac ttg gtt gag tac tca cca gtc 12194
Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val
2115 2120 2125

aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa tta tgc agt 12242
Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser
2130 2135 2140

gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt ctg aca 12290
Ala Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr
2145 2150 2155

acg atc gga gga ccg aag gag cta acc gct ttt ttg cac aac atg ggg 12338
Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly
2160 2165 2170 2175

gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc 12386
Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala
2180 2185 2190

ata cca aac gac gag cgt gac acc acg atg cct gta gca atg gca aca 12434
Ile Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr
2195 2200 2205

acg ttg cgc aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg 12482
Thr Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg
2210 2215 2220

caa caa tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt 12530
Gln Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu
2225 2230 2235

ctg cgc tcg gcc ctt ccg gct ggc tgg ttt att gct gat aaa tct gga 12578
Leu Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly
2240 2245 2250 2255

gcc ggt gag cgt ggg tct cgc ggt atc att gca gca ctg ggg cca gat 12626
Ala Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp
2260 2265 2270

ggt aag ccc tcc cgt atc gta gtt atc tac acg acg ggg agt cag gca 12674
Gly Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala
2275 2280 2285

act atg gat gaa cga aat aga cag atc gct gag ata ggt gcc tca ctg 12722
Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu
2290 2295 2300

att aag cat tgg taactgtcag accaagttta ctcatatata ctttagattg 12774
Ile Lys His Trp
2305



atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 12834
tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 12894
tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 12954
aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 13014
aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 13074
taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 13134
taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 13194
agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 13254
tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca 13314
cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 13374
agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 13434
gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 13494
aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 13554
tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag 13614
ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg 13674
aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct 13734
ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt 13794
agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg 13854
gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat tacgcc 13910

<210> 16 
<211> 2307 
<212> PRT 

<213> Artificial Sequence 
<400> 16 

Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro Met Phe Ser Gly
1 5 10 15

Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr Gln Ile Ala Gln
20 25 30

Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn Arg Tyr Gly Thr
35 40 45

Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala Leu Glu Ala Thr
50 55 60

Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe Ser Val Ile Gly
65 70 75 80

Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu Met Gly Ile Pro
85 90 95

Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu Leu Ile
100 105 110

Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala
115 120 125

Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu Val Phe Phe Cys
130 135 140

Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala
145 150 155 160

Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro
165 170 175

Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser Cys Gly Gly Ala
180 185 190

Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val
195 200 205

Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe Thr Thr Arg Ala
210 215 220

Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn Ala Arg Gly Gly
225 230 235 240

Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His Pro Glu Leu Ile
245 250 255

Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly Pro Leu Met Val
260 265 270

Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val Arg Ala Gln Gly
275 280 285

Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala Gly Gly His Tyr
290 295 300

Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr Gly Thr Tyr Ile
305 310 315 320

Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His Ala Gly Leu Arg
325 330 335

Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser Asp Met Glu Thr
340 345 350

Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Ala Gly Asp Ile Ile
355 360 365

Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu Ile Leu Leu Gly
370 375 380

Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu Leu Ala Pro Ile
385 390 395 400

Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr
405 410 415

Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val
420 425 430

Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val
435 440 445

Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro
450 455 460

Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val
465 470 475 480

Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys
485 490 495

Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro
500 505 510

Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro
515 520 525

Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys Pro Ser
530 535 540

Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val
545 550 555 560

Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met
565 570 575

Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln
580 585 590

Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser
595 600 605

Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val
610 615 620

Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser
625 630 635 640

Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile
645 650 655

Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
660 665 670

Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu
675 680 685

Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
690 695 700

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala
705 710 715 720

Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val
725 730 735

Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro
740 745 750

Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
755 760 765

Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn
770 775 780

Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile
785 790 795 800

Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr
805 810 815

Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr
820 825 830

Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val
835 840 845

Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg
850 855 860

Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser
865 870 875 880

Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys
885 890 895

Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala
900 905 910

Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe
915 920 925

Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu
930 935 940

Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr
945 950 955 960

Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp
965 970 975

Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro
980 985 990

Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu
995 1000 1005

Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu
1010 1015 1020

Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala
1025 1030 1035 1040

Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg
1045 1050 1055

Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg Glu Leu Leu
1060 1065 1070

Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr
1075 1080 1085

Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu
1090 1095 1100

Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val
1105 1110 1115 1120

Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys His Met
1125 1130 1135

Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu
1140 1145 1150

Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ser Ile
1155 1160 1165

Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn Ile Leu Gly
1170 1175 1180

Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe
1185 1190 1195 1200

Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile Gly Leu Gly
1205 1210 1215

Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly
1220 1225 1230

Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro Ser Thr Glu
1235 1240 1245

Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu Ala Ser Glu Asp
1250 1255 1260

Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Glu Leu
1265 1270 1275 1280

Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu Ser Leu Gly Ile
1285 1290 1295

Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn Arg Glu Ala Ala
1300 1305 1310

Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr Ala Ala
1315 1320 1325

Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly Val Ser Thr Val
1330 1335 1340

Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu Gly Pro
1345 1350 1355 1360

Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val Ala Leu Ser Lys
1365 1370 1375

Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly Ala Thr Ala Thr
1380 1385 1390

Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly Leu Ser
1395 1400 1405

Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg Gly Asn Glu Val
1410 1415 1420

Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val Gly Val
1425 1430 1435 1440

Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr Tyr Ala
1445 1450 1455

His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp Val Pro Ala Ser
1460 1465 1470

Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile Ser Asn
1475 1480 1485

Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys Tyr Met Phe Pro
1490 1495 1500

Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser Gln Gly Gly
1505 1510 1515 1520

Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala Lys Arg
1525 1530 1535

Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu Met Gln Ala Ser
1540 1545 1550

Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe Glu Pro Gly Asp
1555 1560 1565

Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser Leu Met
1570 1575 1580

Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg Asn Pro Arg Gly
1585 1590 1595 1600

Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His His Glu
1605 1610 1615

Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp Asp Ala
1620 1625 1630

Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp Thr Leu Ser Leu
1635 1640 1645

Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr Pro Leu
1650 1655 1660

Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys Ala Arg Asp Arg
1665 1670 1675 1680

Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr Val Leu
1685 1690 1695

Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly Ser Pro
1700 1705 1710

Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu Glu Thr His Ala
1715 1720 1725

Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His Leu Val
1730 1735 1740

His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val Met Ala Phe Ala
1745 1750 1755 1760

Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro Pro Ala Gly
1765 1770 1775

Thr Thr Asp Ala Ala His Pro Gly Met Ser Glu Lys Tyr Ile Val Thr
1780 1785 1790

Trp Asp Met Leu Gln Ile His Ala Arg Lys Leu Ala Ser Arg Leu Met
1795 1800 1805

Pro Ser Glu Gln Trp Lys Gly Ile Ile Ala Val Ser Arg Gly Gly Leu
1810 1815 1820

Val Pro Gly Ala Leu Leu Ala Arg Glu Leu Gly Ile Arg His Val Asp
1825 1830 1835 1840

Thr Val Cys Ile Ser Ser Tyr Asp His Asp Asn Gln Arg Glu Leu Lys
1845 1850 1855

Val Leu Lys Arg Ala Glu Gly Asp Gly Glu Gly Phe Ile Val Ile Asp
1860 1865 1870

Asp Leu Val Asp Thr Gly Gly Thr Ala Val Ala Ile Arg Glu Met Tyr
1875 1880 1885

Pro Lys Ala His Phe Val Thr Ile Phe Ala Lys Pro Ala Gly Arg Pro
1890 1895 1900

Leu Val Asp Asp Tyr Val Val Asp Ile Pro Gln Asp Thr Trp Ile Glu
1905 1910 1915 1920

Gln Pro Trp Asp Met Gly Val Val Phe Val Pro Pro Ile Ser Gly Arg
1925 1930 1935

Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val Ala Ala
1940 1945 1950

Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu Asn Leu
1955 1960 1965

Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys Met Lys
1970 1975 1980

Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu Thr Glu
1985 1990 1995 2000

Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys Arg Lys
2005 2010 2015

Cys Tyr Ile Asp Ser Met Ser Ile Gln His Phe Arg Val Ala Leu Ile
2020 2025 2030

Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr
2035 2040 2045

Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly
2050 2055 2060

Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg
2065 2070 2075 2080

Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys
2085 2090 2095

Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg
2100 2105 2110

Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr
2115 2120 2125

Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala
2130 2135 2140

Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr
2145 2150 2155 2160

Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp
2165 2170 2175

His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile
2180 2185 2190

Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr
2195 2200 2205

Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln
2210 2215 2220

Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu
2225 2230 2235 2240

Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala
2245 2250 2255

Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly
2260 2265 2270

Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr
2275 2280 2285

Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile
2290 2295 2300

Lys His Trp
2305



<210> 17 
<211> 92 
<212> PRT 

<213> Artificial Sequence 
<400> 17 

Met Asn Gly Gly His Ile Gln Leu Ile Ile Gly Pro Met Phe Ser Gly
1 5 10 15

Lys Ser Thr Glu Leu Ile Arg Arg Val Arg Arg Tyr Gln Ile Ala Gln
20 25 30

Tyr Lys Cys Val Thr Ile Lys Tyr Ser Asn Asp Asn Arg Tyr Gly Thr
35 40 45

Gly Leu Trp Thr His Asp Lys Asn Asn Phe Glu Ala Leu Glu Ala Thr
50 55 60

Lys Leu Cys Asp Val Leu Glu Ser Ile Thr Asp Phe Ser Val Ile Gly
65 70 75 80

Ile Asp Glu Gly Gln Phe Phe Pro Asp Ile Val Glu
85 90

<210> 18
<211> 1692
<212> PRT

<213> Artificial Sequence
<400> 18

Met Gly Ile Pro Gln Phe Met Ala Arg Val Cys Ala Cys Leu Trp Met
1 5 10 15

Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Val
20 25 30

Leu Asn Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe Leu
35 40 45

Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly
50 55 60

Ala Ala Tyr Ala Leu Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu
65 70 75 80

Ala Leu Pro Pro Arg Ala Tyr Ala Met Asp Arg Glu Met Ala Ala Ser
85 90 95

Cys Gly Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr Leu Ser Pro
100 105 110

Tyr Tyr Lys Val Phe Leu Ala Arg Leu Ile Trp Trp Leu Gln Tyr Phe
115 120 125

Thr Thr Arg Ala Glu Ala His Leu His Val Trp Ile Pro Pro Leu Asn
130 135 140

Ala Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Met Cys Ala Val His
145 150 155 160

Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu Leu Ile Ala Ile Leu Gly
165 170 175

Pro Leu Met Val Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe Val
180 185 190

Arg Ala Gln Gly Leu Ile His Ala Cys Met Leu Val Arg Lys Val Ala
195 200 205

Gly Gly His Tyr Val Gln Met Ala Phe Met Lys Leu Gly Ala Leu Thr
210 215 220

Gly Thr Tyr Ile Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His
225 230 235 240

Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser
245 250 255

Asp Met Glu Thr Lys Ile Ile Thr Trp Gly Ala Asp Thr Ala Ala Ala
260 265 270

Gly Asp Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys Glu
275 280 285

Ile Leu Leu Gly Pro Ala Asp Ser Leu Glu Gly Arg Gly Trp Arg Leu
290 295 300

Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
305 310 315 320

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
325 330 335

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
340 345 350

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
355 360 365

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
370 375 380

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
385 390 395 400

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
405 410 415

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
420 425 430

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu
435 440 445

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
450 455 460

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
465 470 475 480

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
485 490 495

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
500 505 510

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
515 520 525

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
530 535 540

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
545 550 555 560

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
565 570 575

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
580 585 590

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
595 600 605

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
610 615 620

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
625 630 635 640

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
645 650 655

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
660 665 670

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
675 680 685

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
690 695 700

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
705 710 715 720

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
725 730 735

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
740 745 750

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
755 760 765

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
770 775 780

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
785 790 795 800

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
805 810 815

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
820 825 830

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
835 840 845

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
850 855 860

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
865 870 875 880

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
885 890 895

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
900 905 910

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
915 920 925

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly 
930 935 940 

Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val 
945 950 955 960

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
965 970 975

Arg Glu Leu Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
980 985 990

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
995 1000 1005

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
1010 1015 1020

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
1025 1030 1035 1040

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
1045 1050 1055

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
1060 1065 1070

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
1075 1080 1085

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
1090 1095 1100

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
1105 1110 1115 1120

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
1125 1130 1135

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
1140 1145 1150

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Glu Glu
1155 1160 1165

Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly
1170 1175 1180

Ala Leu Glu Leu Leu Leu Leu Leu Leu Leu Gly Leu Arg Leu Gln Leu
1185 1190 1195 1200

Ser Leu Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp Asn
1205 1210 1215

Arg Glu Ala Ala Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala
1220 1225 1230

Gln Thr Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly Asp Gly Met Gly
1235 1240 1245

Val Ser Thr Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp
1250 1255 1260

Lys Leu Gly Pro Glu Ile Pro Leu Ala Met Asp Arg Phe Pro Tyr Val
1265 1270 1275 1280

Ala Leu Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp Ser Gly
1285 1290 1295

Ala Thr Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr
1300 1305 1310

Ile Gly Leu Ser Ala Ala Ala Arg Phe Asn Gln Cys Asn Thr Thr Arg
1315 1320 1325

Gly Asn Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys
1330 1335 1340

Ser Val Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala
1345 1350 1355 1360

Gly Thr Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp
1365 1370 1375

Val Pro Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln
1380 1385 1390

Leu Ile Ser Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys
1395 1400 1405

Tyr Met Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr
1410 1415 1420

Ser Gln Gly Gly Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp
1425 1430 1435 1440

Leu Ala Lys Arg Gln Gly Ala Arg Tyr Val Trp Asn Arg Thr Glu Leu
1445 1450 1455

Met Gln Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe
1460 1465 1470

Glu Pro Gly Asp Met Lys Tyr Glu Ile His Arg Asp Ser Thr Leu Asp
1475 1480 1485

Pro Ser Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg
1490 1495 1500

Asn Pro Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His
1505 1510 1515 1520

Gly His His Glu Ser Arg Ala Tyr Arg Ala Leu Thr Glu Thr Ile Met
1525 1530 1535

Phe Asp Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp
1540 1545 1550

Thr Leu Ser Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly
1555 1560 1565

Gly Tyr Pro Leu Arg Gly Ser Cys Ile Phe Gly Leu Ala Pro Gly Lys
1570 1575 1580 

Ala Arg Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro 
1585 1590 1595 1600

Gly Tyr Val Leu Lys Asp Gly Ala Arg Pro Asp Val Thr Glu Ser Glu
1605 1610 1615

Ser Gly Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro Leu Asp Glu
1620 1625 1630

Glu Thr His Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln
1635 1640 1645

Ala His Leu Val His Gly Val Gln Glu Gln Thr Phe Ile Ala His Val
1650 1655 1660

Met Ala Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala
1665 1670 1675 1680

Pro Pro Ala Gly Thr Thr Asp Ala Ala His Pro Gly
1685 1690

<210> 19 
<211> 152 
<212> PRT 

<213> Artificial Sequence 
<400> 19 

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150 



<210> 20
<211> 85
<212> PRT

<213> Artificial Sequence
<400> 20

Phe Cys Glu Arg Met Ala Asn Glu Gly Lys Ile Val Ile Val Ala Ala
1 5 10 15

Leu Asp Gly Thr Phe Gln Arg Lys Pro Phe Asn Asn Ile Leu Asn Leu
20 25 30

Ile Pro Leu Ser Glu Met Val Val Lys Leu Thr Ala Val Cys Met Lys
35 40 45

Cys Phe Lys Glu Ala Ser Phe Ser Lys Arg Leu Gly Glu Glu Thr Glu
50 55 60

Ile Glu Ile Ile Gly Gly Asn Asp Met Tyr Gln Ser Val Cys Arg Lys
65 70 75 80

Cys Tyr Ile Asp Ser
85 



<210> 21 
<211> 286 
<212> PRT 

<213> Artificial Sequence 
<400> 21 

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1 5 10 15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
20 25 30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
35 40 45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
50 55 60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65 70 75 80

Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
85 90 95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
100 105 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
115 120 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
130 135 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145 150 155 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
165 170 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
180 185 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
195 200 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
210 215 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225 230 235 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
260 265 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp
275 280 285 



<210> 22 
<211> 220 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Sac I/SEAP/Bam
HI construct

<400> 22 

gcgcgcgagc tcctgctgct gctgctgctg ggcctgaggc tacagctctc cctgggcatc 60 

atcccagttg aggaggagaa cccggacttc tggaaccgcg aggcagccga ggccctgggt 120 

gccgccaaga agctgcagcc tgcacagaca gccgccaaga acctcatcat cttcctgggc 180 
gatgggatgg gggtgtctac ggtgacagct gccaggatcc 220 

<210> 23 
<211> 88 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: amino acid 
fragment of the HCV polyprotein 

<400> 23 

Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu Leu Ile Ala Gln Ala
1 5 10 15

Glu Ala Ala Leu Glu Asn Leu Val Val Leu Asn Ser Ala Ser Val Ala
20 25 30

Gly Ala His Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp
35 40 45

Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Thr Tyr Ala Leu Tyr Gly
50 55 60

Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro Arg Ala Tyr
65 70 75 80

Ala Met Asp Arg Glu Met Ala Ala
85

<210> 24 
<211> 260 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: DNA fragment 
coding for an amino acid fragment of the HCV 
polyprotein 

<400> 24

gcacgtgtct gtgcctgctt gtggatgatg ctgctgatag cccaggccga ggccgccttg 60 

gagaacctgg tggtcctcaa tgcggcgtct gtggccggcg cacatggcat cctctccttc 120 
cttgtgttct tctgtgccgc ctggtacatc aaaggcaggc tggtccctgg ggcggcatat 180 
gctctttatg gcgtgtggcc gctgctcctg ctcttgctgg cattaccacc gcgagcttac 240 

gccatggacc gggagatggc 260

<210> 25 
<211> 177 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: amino acid
fragment of the HCV polyprotein 

<400> 25 

Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu
1 5 10 15

Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln
20 25 30

Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu
35 40 45

Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr
50 55 60

Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu
65 70 75 80

Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr
85 90 95

Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro
100 105 110

Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala
115 120 125

Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly
130 135 140

Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser
145 150 155 160

Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile
165 170 175

Leu



<210> 26 
<211> 528 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: DNA fragment 
coding for an amino acid fragment of the HCV 
polyprotein 

<400> 26 

tgcgcctcgc acctccctta catcgagcag ggaatgcagc tcgccgagca attcaagcag 60
aaagcgctcg ggttactgca aacagccacc aaacaagcgg aggctgctgc tcccgtggtg 120 
gagtccaagt ggcgagccct tgagacattc tgggcgaagc acatgtggaa tttcatcagc 180 
gggatacagt acttagcagg cttatccact ctgcctggga accccgcaat agcatcattg 240 
atggcattca cagcctctat caccagcccg ctcaccaccc aaagtaccct cctgtttaac 300 
atcttggggg ggtgggtggc tgcccaactc gcccccccca gcgccgcttc ggctttcgtg 360 
ggcgccggca tcgccggtgc ggctgttggc agcataggcc ttgggaaggt gcttgtggac 420 
attctggcgg gttatggagc aggagtggcc ggcgcgctcg tggcctttaa ggtcatgagc 480 
ggcgagatgc cctccaccga ggacctggtc aatctacttc ctgccatc 528 
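
The correspondence between a coding DNA fragment and its amino acid fragment (for example SEQ ID NO: 26 and SEQ ID NO: 25) can be checked with a short script. The following is a minimal sketch in Python, assuming the standard genetic code and translation from the first nucleotide; the names `translate`, `CODON_TABLE` and `THREE_LETTER` are illustrative only and do not come from the listing.

```python
from itertools import product

# Assumption: standard genetic code, codons enumerated with bases in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the listing ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string into three-letter residue codes.

    Spaces, line breaks and the running position numbers of the listing
    are ignored; a trailing incomplete codon is dropped.
    """
    seq = "".join(ch for ch in dna.lower() if ch in BASES)
    usable = len(seq) - len(seq) % 3
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


if __name__ == "__main__":
    # First row of SEQ ID NO: 26 as printed in the listing.
    row = "tgcgcctcgc acctccctta catcgagcag ggaatgcagc tcgccgagca attcaagcag 60"
    print(" ".join(translate(row)))
```

Applied to the first 60 nucleotides of SEQ ID NO: 26, the sketch yields Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln, which agrees with residues 1 to 20 of SEQ ID NO: 25.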

<210> 27 
<211> 33 
<212> DNA 
<213> primer 

<400> 27 

gcgcgcgaat tcatggcacg tgtctgtgcc tgc 

<210> 28 
<211> 33 
<212> DNA 
<213> primer 

<400> 28

cgcgcgctcg aggatggcag gaagtagatt gac 33



<210> 29 
<211> 20 
<212> PRT 

<213> putative NS5A/5B cleavage site 
<400> 29 

Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp 
1 5 10 15

Thr Gly Ala Leu 
20 



<210> 30
<211> 33
<212> DNA
<213> primer

<400> 30

gcgcgcctcg aggaagctag tgaggatgtc gtc

<210> 31
<211> 36
<212> DNA
<213> primer

<400> 31

cgcgcggagc tccaaggcgc ctgtccatgt gtagga 36

<210> 32
<211> 69
<212> DNA
<213> primer

<400> 32

ctcgaggaag ctagtgagga tgtcgtctgc tgctcaatgt cctacacatg gacaggcgcc 60
ttggagctc 69

<210> 33
<211> 6
<212> PRT

<213> HCV/SEAP 6 amino acid fragment
<400> 33

Met Gly Ile Pro Gln Phe
1 5